CHAPTER I


INTRODUCTION AND SUMMARY

This dissertation is concerned with the control of a one-dimensional
diffusion process. The concept of a controlled diffusion process used
here is almost identical to that defined by Mandl [16]. The system we
both consider is a stationary diffusion process defined on a compact
interval of the real line. The drift and diffusion coefficients depend
upon a stationary control which is state dependent. The costs generated
by the process are functions of both the control and the sample path of
the process. Mandl's concept of a controlled diffusion process is
generalized by allowing controls to be vector-valued with the set of
admissible control actions defined by a piecewise continuous set-valued
function on the state space.

If the process is controlled by a single person and the process
generates a single stream of costs, then we have the ordinary optimal
control problem of Chapter II. These problems involve finding controls
which minimize expected costs. In the case of undiscounted costs,
Mandl's results are generalized to account for our previously mentioned
restriction on admissible control functions. With discounting, the
minimal expected discounted cost as a function of the initial state of
the process is shown to be the unique solution of a differential
equation. A necessary and sufficient condition is given for a control to
yield the minimal expected discounted cost, and a method is presented for
computing this optimal control.

In Chapter III, we suppose the process is controlled by two persons.
One player, who chooses one component of the control, wants to minimize
the expected cost, while the second player, who chooses the other
control component, wants to maximize the expected cost. The problem of
controlling this process is a sequential, zero sum, two-person game.
Moreover, by an optimal control is meant a saddlepoint of the expected
cost function. The expected cost associated with a saddlepoint is called
the value of the game. The value, if it exists, is shown to be the
unique solution of a differential equation. Furthermore, a necessary and
sufficient condition is given for a control to be optimal.

If the diffusion process is controlled by N persons and it generates
N streams of costs, then the problem of controlling the process becomes
a sequential, non-zero sum, N-person game. Suppose the i-th player, who
operates the i-th control component, wants to minimize the expected
costs of the i-th cost stream. Then by an optimal control will be meant
one that is a Nash equilibrium point of the expected costs corresponding
to all admissible controls. In Chapter III each Nash equilibrium point
is shown to be the unique solution of a differential equation, and a
necessary and sufficient condition is given for a control to be optimal.

A piecewise continuous optimal control may not exist for a problem of
any type mentioned above. In Chapter II, a piecewise continuous optimal
control is shown to exist for the single person control problem under a
variety of alternative conditions. One is that the action set in each
state is finite and state independent, and the drift, diffusion, and
continuous movement costs are analytic in the state. A second is that
the action set in each state be convex, the diffusion coefficient be
action independent, the drift coefficient be affine in the action, and
the continuous movement cost be strictly convex in the action.
Conditions of the latter type are also given for multiperson control
problems.

Chapters II and III provide several applications of the models
developed herein. The ordinary optimal control model is applied to the
problems of controlling a reservoir, controlling pollution, optimizing a
queueing system, and making optimal investments. The zero sum,
two-person game model is applied to the problem of determining how many
people should be receiving welfare. The non-zero sum N-person game model
is applied to the problems of pollution and warfare. Chapters II and III
also include numerous example calculations of the optimal control for
each type of model; some of these examples pertain to the applications
that are discussed.


CHAPTER  II 


SINGLE  PERSON  CONTROLLED  DIFFUSIONS 


This chapter describes a class of single person controlled
one-dimensional diffusion processes where the control is a vector-valued
function on the state space. These processes generate costs, and the
optimal control problem is to choose an admissible control that
minimizes expected costs. The following three sections provide results
for discounted costs, undiscounted costs with a non-conservative
process, and undiscounted costs with a conservative process.
Generalizing Mandl [16], the major results in these sections are
necessary and sufficient conditions for a control to be optimal as well
as characterizations of the expected costs corresponding to an optimal
control. The method for solving a problem is basically the same in each
case. A differential equation is solved and the solution is used to
determine the optimal control.

Section 4 establishes sufficient conditions for the existence of
piecewise continuous optimal controls. The remaining four sections
discuss four potential applications of single person controlled
diffusion processes. Each such section describes how these processes can
be used as models of the physical systems being considered, and then
examples are provided to demonstrate how actual problems could be
solved.


1. The Controlled Diffusion Process.


Consider a diffusion process with state space $S$, a compact interval
$[r_0, r_1]$ of the real line $E$, which is controlled by a single
person. For some positive integer $n$ and some compact set
$K \subset E^n$, the control is a vector-valued function on $S$ with
range $K$. Let $A_s$ be a point-to-set map from $S$ into $K$ such that
$A_s$ is piecewise continuous in the Hausdorff metric and for each
$s \in S$ the set $A_s$ is a nonempty compact subset of $K$. Each time
the process is observed in state $s$, an action "a" is chosen from the
set $A_s$. The set $M$ of admissible controls consists of all piecewise
continuous functions $a(\cdot)$ on $S$ with range in $K$ such that the
action $a(s) \in A_s$ for each $s \in S$. A function $a(\cdot)$ is an
admissible control if and only if $a(\cdot) \in M$. Throughout this
chapter it should be clear from the context whether the letter $a$
denotes an admissible control $a = a(\cdot) \in M$ or an admissible
action $a \in A_s$ for some $s \in S$. In the sequel we shall assume
$A_s$ is such that $M$ is non-void. Although this is always true
whenever $K \subset E$, it is not known whether in general
$M \ne \emptyset$ if $K \subset E^n$, $n \ge 2$.

In order to characterize the map $A_s$, let the set $Z$ consist of all
compact subsets of $K$. We define a metric $\rho$ on $Z$ as follows. For
any $a \in E^n$ and $Y \in Z$ let $D(a,Y)$ be the Euclidean distance
between the point $a$ and the set $Y$. For any $Y_1, Y_2 \in Z$ let

$\rho(Y_1, Y_2) = \sup_{a \in Y_1} D(a, Y_2) + \sup_{a \in Y_2} D(a, Y_1).$

Then $(Z, \rho)$ is a metric space with the Hausdorff metric, and the
map $A_s$ from $S$ into $(Z, \rho)$ is continuous at $s$ for all but a
finite number of $s \in S$.
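As an illustration only, the metric $\rho$ can be evaluated directly
when the compact sets are replaced by finite samples. The Python sketch
below uses the sum form of the metric given above; the function names
and the sampled action set $[-ks, ks]$ are hypothetical choices made for
the illustration and do not appear in the original development.

```python
import math


def point_to_set_distance(a, Y):
    """Euclidean distance D(a, Y) from the point a to the finite set Y."""
    return min(math.dist(a, y) for y in Y)


def rho(Y1, Y2):
    """Sum-form metric: sup over Y1 of D(a, Y2) plus sup over Y2 of D(a, Y1)."""
    return (max(point_to_set_distance(a, Y2) for a in Y1)
            + max(point_to_set_distance(a, Y1) for a in Y2))


def action_set(s, k=1.0, m=5):
    """Hypothetical action set [-k s, k s], sampled at 2m + 1 points in E^1."""
    return [(k * s * i / m,) for i in range(-m, m + 1)]


if __name__ == "__main__":
    # The distance between A_1 and A_s shrinks as s approaches 1,
    # which is what continuity of the map s -> A_s at s = 1 requires.
    for s in (1.1, 1.01, 1.001):
        print(s, rho(action_set(1.0), action_set(s)))
```

For finite samples the suprema are attained, so `max` suffices; for
general compact sets this only approximates $\rho$.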

It can be shown (for example, Hogan [12]) that $A_s$ is continuous in
the Hausdorff metric at the point $s$ if and only if $A_s$ is both upper
semi-continuous, that is, a closed map, and lower semi-continuous at
$s$. The map $A_s$ is upper semi-continuous at $s^\infty$ if
(i) $s^i \to s^\infty$, (ii) $a^i \to a^\infty$, and
(iii) $a^i \in A_{s^i}$ together imply $a^\infty \in A_{s^\infty}$. The
map $A_s$ is lower semi-continuous at $s^\infty$ if
(i) $s^i \to s^\infty$ and (ii) $a^\infty \in A_{s^\infty}$ together
imply there exists a sequence of actions $a^i \in A_{s^i}$ such that
$a^i \to a^\infty$.

The definition of a controlled diffusion process is a slight
generalization of Mandl's [16, p. 157]. Let $d(s,a)$ be a continuous
positive real-valued function on $S \times K$. Then for
$a(\cdot) \in M$ the piecewise continuous function $d(s,a(s))$ is the
diffusion coefficient of the process. Similarly, let $b(s,a)$ be a
continuous real-valued function on $S \times K$ so that $b(s,a(s))$ is
the drift coefficient of the diffusion process.

Following Mandl, with a given control $a(\cdot) \in M$ the diffusion
process is completely specified by the generalized classical
differential operator

$D = d(s,a(s)) \dfrac{d^2}{ds^2} + b(s,a(s)) \dfrac{d}{ds}$

together with Feller's [7,9] boundary condition

$\kappa_j v(r_j) + \theta_j \left( v(r_j) - \int_S v(s)\, d\mu_j(s) \right) - (-1)^j \pi_j v'(r_j) + \sigma_j (Dv)(r_j) = 0, \quad j = 0,1,$


where $v(s)$ is some function whose second derivative is piecewise
continuous on $S$. At each boundary $r_0, r_1$ the four non-negative
parameters $\kappa_j$, $\sigma_j$, $\pi_j$, and $\theta_j$, at least one
of which must be positive, correspond respectively to the phenomena of
absorption, adhesion, reflection and instantaneous return. Corresponding
to $\theta_j$ is the probability distribution function $\mu_j(s)$ where

$\int_S d\mu_j(s) = 1.$


Feller [8] and Ito and McKean [14] present partial probabilistic
interpretations of these boundary conditions in the case of diffusion
processes, and Ito and McKean [13] present a complete description in the
case of Brownian motion. Their results are briefly described here.


The reflecting barrier process with
$\kappa_j = \sigma_j = \theta_j = 0$ and $\pi_j > 0$ for $j = 0,1$ can
be described by constructing a diffusion process on $E$. With
$L = r_1 - r_0$ define the point-to-set map $f : S \to E$ as

$f(s) = \{ x \in E \mid x = s + 2nL \ \text{or}\ x = 2r_1 - s + 2nL \ \text{for some}\ n = 0, \pm 1, \pm 2, \ldots \}.$

Note that $\bigcup_{s \in S} f(s) = E$ and the inverse $f^{-1}(x)$
equals a unique $s \in S$ for every $x \in E$. Let the drift and
diffusion coefficients of the constructed process at the point $x$ be
equal to the drift and diffusion coefficients, respectively, of the
reflecting barrier process at the point $f^{-1}(x)$. If the constructed
process is represented by the sample path $t \to x(t)$ then the sample
path $s^+(t) = f^{-1}(x(t))$ represents the reflecting barrier process.

A diffusion process $s(t)$ without reflecting barriers
($\pi_0 = \pi_1 = 0$) behaves like the reflecting barrier process
$s^+(t)$ up to the first passage time
$m = \min\{ t \mid s^+(t) = r_0 \ \text{or}\ s^+(t) = r_1 \}$. Then, if
$s^+(m) = r_j$, $s(t) = r_j$ for an exponential holding time $e$ with
conditional law

$P(e > t \mid s^+) = \exp\left( -\dfrac{\kappa_j + \theta_j}{\sigma_j}\, t \right).$

At time $m + e$ either the process terminates (absorption) with
probability $\kappa_j / (\kappa_j + \theta_j)$ or starts afresh by
jumping to the point $\xi \in (r_0, r_1)$ with conditional law

$P( s(m + e) \in d\xi \mid e, s^+ ) = \theta_j \mu_j(d\xi) / (\kappa_j + \theta_j).$


The interpretation of the boundary condition of reflection combined
with absorption and/or adhesion
($\pi_j > 0$, $\kappa_j + \sigma_j > 0$, $\theta_j = 0$) is rather more
complicated. Briefly, with $\kappa_j = 0$ the process behaves like the
reflecting barrier process with a stochastic time scale change that
counts standard time while $s^+(t) \ne r_j$ but runs slow on the barrier
with the result that, compared to the reflecting barrier process, this
process lingers at the boundary longer than it should. With
$\kappa_j > 0$ the process behaves as if it is killed at the boundary
$r_j$ at a random time that is a function of the visiting set
$\{ t \mid s(t) = r_j \}$. If $\pi_j(\kappa_j + \sigma_j) > 0$ and the
passage time $m(\tau) = \inf\{ t > \tau \mid s(t) \ne r_j \}$ then the
conditional probability

$P( m(\tau) = \tau \mid s(\tau) = r_j ) = 1.$


The process with both reflection and instantaneous return occurring at
a boundary ($\pi_j > 0$, $\theta_j > 0$) can be constructed from a
process with both reflection and absorption occurring at the boundary
($\pi_j > 0$, $\kappa_j > 0$) as was done by Mandl [16, pp. 64-66].

The diffusion process generates costs according to its sample path and
control (Mandl [16, p. 148]). These costs are of three types.

The continuous movement cost is the cost rate per unit time. Let
$c(s,a)$ be a continuous function from $S \times K$ into $E$. If $s(t)$
is the sample path of the process and $a(s)$ the control, then the
integral of $c(s(t), a(s(t)))$ over a time interval equals the total
continuous movement cost generated over this time interval.

The second kind of cost is associated with jumps (instantaneous
returns) by the process from the boundaries. For $j = 0,1$ let
$\nu_j(s)$ be a function from $S$ into $E$ which is integrable with
respect to $\mu_j(s)$. If the process jumps from boundary $r_j$ to the
point $s \in S$ at time $\tau$ then there arises at this time the jump
cost $\nu_j(s)$. Denote by $\phi_j(t,s)$, $t \ge 0$, $s \in S$,
$j = 0,1$ the integer-valued random variable representing the number of
jumps made by the process up through time $t$ from boundary $r_j$ into
the interval $[r_0, s]$. Then the total cost due to jumps from $r_j$ up
through time $t$ equals the integral over $[r_0, r_1]$ of
$\nu_j(s)\, \phi_j(t, ds)$.

The third kind of cost depends upon the termination of the process. If
the process is absorbed at boundary $r_j$, then at this termination time
there arises the cost $\lambda_j$, $j = 0,1$.

Following Mandl [16, p. 149], if $C(t)$ is the total of the costs
generated by the process up through time $t$, then the Laplace-Stieltjes
transform

$\int_0^\infty e^{-\lambda t}\, dC(t)$

can be regarded as the total, discounted, infinite horizon cost
generated by the process, where the discount factor is $e^{-\lambda}$
and $\lambda > 0$. Given a controlled diffusion process, admissible
control, and discount factor, let $v(s)$ denote the conditional
expectation of the discounted cost of this process given its initial
state $s$, that is,

$v(s) = E^a_s \int_0^\infty e^{-\lambda t}\, dC(t).$
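The discounted cost along a single sample path can be approximated
numerically. The following Python sketch is only an illustration under
simplifying assumptions: the cost stream $C(t)$ is taken to consist of a
cost rate observed as a function of time plus lump costs at given jump
times, and the infinite horizon is truncated. The function name and its
arguments are hypothetical.

```python
import math


def discounted_cost(rate, jumps, lam, horizon, steps=100_000):
    """Approximate the integral of exp(-lam t) dC(t) over [0, horizon].

    rate(t) is the continuous movement cost rate along the path at time t,
    jumps is a list of (time, cost) pairs for lump costs, lam > 0 is the
    discount rate. The continuous part uses the midpoint rule.
    """
    dt = horizon / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += math.exp(-lam * t) * rate(t) * dt
    for t, cost in jumps:
        if t <= horizon:
            total += math.exp(-lam * t) * cost
    return total


if __name__ == "__main__":
    # Constant rate c = 2 and lam = 0.5: the exact value over an infinite
    # horizon is c / lam = 4, and truncation at t = 40 is negligible.
    print(discounted_cost(lambda t: 2.0, [], 0.5, 40.0))
```

Averaging this quantity over many simulated paths started at $s$ would
give a Monte Carlo estimate of $v(s)$; the path simulation itself is not
shown here.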

Mandl [16, p. 149] proves the following result.


Theorem 1. The expected discounted cost $v(s)$ corresponding to
$a(\cdot) \in M$ is the unique function on $S$ such that $v'(s)$ is
continuous,

(1)  $d(s,a(s))\, v''(s) + b(s,a(s))\, v'(s) - \lambda v(s) + c(s,a(s)) = 0$

holds for every $s \in (r_0, r_1)$ which is a continuity point of
$a(s)$, and

(2)  $(\theta_j + \kappa_j) v(r_j) - \theta_j \int_S ( v(s) + \nu_j(s) )\, d\mu_j(s) - (-1)^j \pi_j v'(r_j) + \sigma_j ( \lambda v(r_j) - c(r_j, a(r_j)) ) - \kappa_j \lambda_j = 0, \quad j = 0,1.$
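For a fixed control, equation (1) is a linear two-point boundary value
problem and can be approximated by finite differences. The Python sketch
below is an illustration under stated assumptions, not a method taken
from the source: it treats only the special case of pure reflection at
$r_0$, so that $v'(r_0) = 0$, and pure absorption at $r_1$ with cost
`absorb_cost`, so that $v(r_1)$ equals that cost. All function names are
hypothetical.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (lower[0], upper[-1] unused)."""
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def expected_discounted_cost(d, b, c, control, lam, r0, r1, absorb_cost, n=200):
    """Central-difference solution of d v'' + b v' - lam v + c = 0 on [r0, r1]
    with v'(r0) = 0 (reflection) and v(r1) = absorb_cost (absorption)."""
    h = (r1 - r0) / n
    grid = [r0 + i * h for i in range(n + 1)]
    lower, diag, upper, rhs = [], [], [], []
    for i in range(n):
        s = grid[i]
        a = control(s)
        di, bi, ci = d(s, a), b(s, a), c(s, a)
        lo = di / h**2 - bi / (2 * h)
        up = di / h**2 + bi / (2 * h)
        if i == 0:
            # Ghost point v_{-1} = v_{1} enforces v'(r0) = 0.
            lo, up = 0.0, lo + up
        lower.append(lo)
        diag.append(-2 * di / h**2 - lam)
        upper.append(up)
        rhs.append(-ci)
    # The last row pins v(r1) to the absorption cost.
    lower.append(0.0)
    diag.append(1.0)
    upper.append(0.0)
    rhs.append(absorb_cost)
    return grid, solve_tridiagonal(lower, diag, upper, rhs)


if __name__ == "__main__":
    grid, v = expected_discounted_cost(
        d=lambda s, a: 1.0, b=lambda s, a: 0.0, c=lambda s, a: 0.0,
        control=lambda s: 0.0, lam=1.0, r0=0.0, r1=1.0, absorb_cost=1.0)
    # For these coefficients the exact solution is cosh(s) / cosh(1).
    print(v[0], 1.0 / math.cosh(1.0))
```

The central-difference scheme is stable here when the mesh width is at
most $2d/|b|$; general boundary conditions of the form (2) would need
additional rows and are not covered by this sketch.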


If the process is non-conservative and neither boundary is purely
adhesive, that is,

$\kappa_0 + \kappa_1 > 0, \quad \kappa_j + \pi_j + \theta_j > 0, \quad j = 0,1,$

then Mandl [16, p. 152] shows that the expected total undiscounted cost
$v(s) = E^a_s C(\infty)$ is finite and is the unique solution of (1) and
(2) for $\lambda = 0$.

If the process is conservative ($\kappa_0 = \kappa_1 = 0$), then the
total undiscounted cost may be infinite. The number $\Theta$ in the
following theorem by Mandl [16, pp. 152-157, 168] can be interpreted as
the mean cost per unit time.


Theorem 2. Let $\kappa_0 = \kappa_1 = 0$ and assume at least one
boundary is not purely adhesive, that is,
$\pi_0 + \theta_0 + \pi_1 + \theta_1 > 0$. If $v(s,\lambda)$ is the
expected discounted cost corresponding to $\lambda > 0$ and some
$a(\cdot) \in M$, then

$\lim_{\lambda \downarrow 0} \lambda v(s,\lambda) = \Theta$  and  $\lim_{\lambda \downarrow 0} \dfrac{\partial}{\partial s} v(s,\lambda) = w(s),$

where $\Theta$ is some number independent of the state $s$, and $w(s)$
is some absolutely continuous function on $S$. Moreover,

$P\left( \lim_{t \to \infty} t^{-1} C(t) = \Theta \right) = 1$

and $(\Theta, w)$ is the unique pair satisfying

(3)  $d(s,a(s))\, w'(s) + b(s,a(s))\, w(s) - \Theta + c(s,a(s)) = 0$

for every $s \in (r_0, r_1)$ which is a continuity point of $a(s)$, and

(4)  $\theta_j \int_S \left( \int_{r_j}^{s} w(y)\, dy + \nu_j(s) \right) d\mu_j(s) + (-1)^j \pi_j w(r_j) + \sigma_j ( c(r_j, a(r_j)) - \Theta ) = 0, \quad j = 0,1.$




2. The Discounted Cost Case.

Let $v(s,a) = v(s)$ denote the expected discounted cost of a process
corresponding to the admissible control $a \in M$. Then $v(s,a)$ will be
the unique solution of (1), (2). The minimal expected discounted cost
$\hat v(s)$ is defined to be

$\hat v(s) = \inf_{a \in M} v(s,a).$

An admissible control $\hat a \in M$ is said to be an optimal control if
$v(s,\hat a) = \hat v(s)$ for all $s \in S$. The results here for the
discounted cost case generalize Mandl's [16, pp. 158-173] results for
undiscounted costs. The main results of this section are Theorem 3,
which characterizes the minimal expected cost, and Theorem 6, which
provides necessary and sufficient conditions for an admissible control
to be optimal.

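Theorem 3. The minimal expected discounted cost $\hat v(s)$ is the
unique solution of the equation

(5)  $\hat v''(s) + \min_{a \in A_s} \{ d(s,a)^{-1} [\, b(s,a)\, \hat v'(s) - \lambda \hat v(s) + c(s,a) \,] \} = 0$

satisfying

(6)  $(\theta_j + \kappa_j) \hat v(r_j) - \theta_j \int_S ( \hat v(s) + \nu_j(s) )\, d\mu_j(s) - (-1)^j \pi_j \hat v'(r_j) + \sigma_j ( \lambda \hat v(r_j) - \gamma_j ) - \kappa_j \lambda_j = 0, \quad j = 0,1,$

where

$\gamma_j = \min_{a \in A_{r_j}} c(r_j, a), \quad j = 0,1.$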


Two preliminary lemmas will be provided before the proof of Theorem 3
is presented. The first lemma, stated without proof, is a slight
modification of a selection theorem due to Dubins and Savage
[5, Chap. 2,16] (see also Maitra [15]). I am grateful to Robert
Rosenthal for suggesting the appropriateness of Lusin's Theorem in the
proof of Lemma 5.

Lemma 4. If $h(s,a)$ is a continuous real-valued function on
$S \times K$, then there exists a Borel measurable function $f(s)$ on
$S$ into $K$ such that $h(s,f(s)) = \min_{a \in A_s} h(s,a)$ and
$f(s) \in A_s$ for each $s \in S$.

Lemma 5. Suppose $f(\cdot)$ is a function on $S$ into $K$ which is
measurable with respect to Borel measure $\mu$ and satisfies
$f(s) \in A_s$ for each $s \in S$. Then for every $\varepsilon > 0$
there exists a measurable subset $\hat S \subset S$ and an admissible
control $a(\cdot) \in M$ such that $\mu(S - \hat S) < \varepsilon$ and
$a(s) = f(s)$ for all $s \in \hat S$.

Proof: Lusin's Theorem (e.g., Royden [18]) is valid for vector-valued
measurable functions, so for every $\varepsilon > 0$ there exists a
measurable subset $S' \subset S$ and a continuous function $g(s)$ on $S$
such that $\mu(S - S') < \varepsilon/3$ and $f(s) = g(s)$ for all
$s \in S'$. It remains to show that $g(s)$ coincides with some
admissible control on a large enough subset of $S'$.

Let $S'' = \{ s \in S \mid g(s) \in A_s \}$, so clearly
$S' \subset S''$ and $\mu(S - S'') < \varepsilon/3$. Since $g(s)$ and
$A_s$ both have closed graphs, $S''$ is closed. Thus there exists a
sequence $\{S_i\}$ of disjoint closed intervals such that
$\bigcup_{i=1}^{\infty} S_i = S''$. Without loss of generality, assume
$\mu(S_1) \ge \mu(S_2) \ge \cdots$, so that for some integer $N$,
$\mu(S'' - \bigcup_{i=1}^{N} S_i) < \varepsilon/3$. Note that there
exists an admissible control $a(\cdot) \in M$ such that $a(s) = g(s)$
for all $s \in \bigcup_{i=1}^{N} S_i$. If we set
$\hat S = S' \cap ( \bigcup_{i=1}^{N} S_i )$, then

$\mu(S - \hat S) \le \mu(S - S') + \mu(S - \bigcup_{i=1}^{N} S_i) = \mu(S - S') + \mu(S - S'') + \mu(S'' - \bigcup_{i=1}^{N} S_i) < \varepsilon,$

so the pair $(a(s), \hat S)$ is as desired.

Proof of Theorem 3. Equation (5) has a unique solution $\hat v(s)$
satisfying boundary conditions (6) by Theorem 3 of Chapter III. The
remainder of this proof will be in two parts; first it will be shown
that $\hat v(s) \le v(s,a)$ for all $a \in M$.

For arbitrary $a \in M$ denote $\tilde v(s) = v(s,a) - \hat v(s)$ and
define $\psi(s)$ by

(7)  $\psi(s) = \tilde v''(s) + d(s,a(s))^{-1} [\, b(s,a(s))\, \tilde v'(s) - \lambda \tilde v(s) \,]$

     $= v''(s,a) + d(s,a(s))^{-1} [\, b(s,a(s))\, v'(s,a) - \lambda v(s,a) + c(s,a(s)) \,]$
       $- \{ \hat v''(s) + d(s,a(s))^{-1} [\, b(s,a(s))\, \hat v'(s) - \lambda \hat v(s) + c(s,a(s)) \,] \}$

     $\le v''(s,a) + d(s,a(s))^{-1} [\, b(s,a(s))\, v'(s,a) - \lambda v(s,a) + c(s,a(s)) \,]$
       $- \{ \hat v''(s) + \min_{a \in A_s} \{ d(s,a)^{-1} [\, b(s,a)\, \hat v'(s) - \lambda \hat v(s) + c(s,a) \,] \} \} = 0.$

The last equality follows from equations (1) and (5); note by the first
equality that $\psi(s)$ is piecewise continuous. Subtracting (6) from
(2), it can be seen that $\tilde v(s)$ also satisfies

(8)  $(\theta_j + \kappa_j) \tilde v(r_j) - \theta_j \int_S \tilde v(s)\, d\mu_j(s) - (-1)^j \pi_j \tilde v'(r_j) + \sigma_j ( \lambda \tilde v(r_j) - ( c(r_j, a(r_j)) - \gamma_j ) ) = 0, \quad j = 0,1.$

Conclude from (7) and (8) that $\tilde v(s)$ is the expected discounted
cost corresponding to the controlled diffusion process with admissible
control $a(s)$, continuous movement cost
$\hat c(s,a(s)) = -d(s,a(s))\, \psi(s) \ge 0$, instantaneous return cost
equal to zero, absorption cost equal to zero, and
$\hat c(r_j, a(r_j)) = c(r_j, a(r_j)) - \gamma_j \ge 0$, $j = 0,1$.
Since all costs are non-negative it is apparent that
$\tilde v(s) = v(s,a) - \hat v(s) \ge 0$ for all $s \in S$.

The final part of this proof is to show
$\hat v(s) = \inf_{a \in M} v(s,a)$ by demonstrating the existence of a
sequence $\{a_n(\cdot)\}$ of admissible controls with the property that
$v(s,a_n) \to \hat v(s)$ as $n \to \infty$ for all $s \in S$. By Lemma 4
there exists a Borel measurable function $f(\cdot)$ from $S$ into $K$
such that $f(s) \in A_s$ and

$d(s,f(s))^{-1} [\, b(s,f(s))\, \hat v'(s) - \lambda \hat v(s) + c(s,f(s)) \,] = \min_{a \in A_s} d(s,a)^{-1} [\, b(s,a)\, \hat v'(s) - \lambda \hat v(s) + c(s,a) \,]$

for each $s \in S$. By Lemma 5 there exists a sequence of admissible
controls that converges in measure to $f(s)$. Now some subsequence must
converge almost everywhere to $f(s)$, so there exists a sequence
$\{a_n(\cdot)\}$ of admissible controls that converges a.e. to
$f(\cdot)$. Also, we can assume $a_n(r_j) \in A_{r_j}$ is such that
$\sigma_j ( c(r_j, a_n(r_j)) - \gamma_j ) = 0$ for $n = 1, 2, \ldots$
and $j = 0,1$.

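Denote $\tilde v_n(s) = v(s,a_n) - \hat v(s)$ and define $\psi_n(s)$ as
in (7) by

$\psi_n(s) = \tilde v_n''(s) + d(s,a_n(s))^{-1} [\, b(s,a_n(s))\, \tilde v_n'(s) - \lambda \tilde v_n(s) \,]$

$= -\{ \hat v''(s) + d(s,a_n(s))^{-1} [\, b(s,a_n(s))\, \hat v'(s) - \lambda \hat v(s) + c(s,a_n(s)) \,] \}.$

Note that the piecewise continuous function $-\psi_n(s)$ converges
almost everywhere to

$\hat v''(s) + d(s,f(s))^{-1} [\, b(s,f(s))\, \hat v'(s) - \lambda \hat v(s) + c(s,f(s)) \,]$

$= \hat v''(s) + \min_{a \in A_s} \{ d(s,a)^{-1} [\, b(s,a)\, \hat v'(s) - \lambda \hat v(s) + c(s,a) \,] \} = 0.$

As in (8) we see that $\tilde v_n(s)$ must satisfy

$(\theta_j + \kappa_j) \tilde v_n(r_j) - \theta_j \int_S \tilde v_n(s)\, d\mu_j(s) - (-1)^j \pi_j \tilde v_n'(r_j) + \sigma_j \lambda \tilde v_n(r_j) = 0, \quad j = 0,1.$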


Thus $\tilde v_n(s)$ is the expected discounted cost of a controlled
diffusion process with control $a_n \in M$, zero jump, stopping and
adhesion costs, and continuous movement cost
$-d(s,a_n(s))\, \psi_n(s)$.* Since the only cost of the process
converges a.e. to zero, we must have
$\tilde v_n(s) = v(s,a_n) - \hat v(s) \to 0$ as $n \to \infty$ and
Theorem 3 is proved.

* Note that the continuous movement cost here is no longer the
composition of a continuous function on $S \times K$ with a piecewise
continuous control. However, in view of Mandl [16, pp. 148-49], this
presents no problem since $-d(s,a_n(s))\, \psi_n(s)$ is piecewise
continuous on $S$.

Theorem 6. Let $\hat v(s)$ be the minimal expected discounted cost for
a controlled diffusion process. A control $a \in M$ is optimal if and
only if

(9)  $d(s,a(s))^{-1} [\, b(s,a(s))\, \hat v'(s) - \lambda \hat v(s) + c(s,a(s)) \,] = \min_{a \in A_s} \{ d(s,a)^{-1} [\, b(s,a)\, \hat v'(s) - \lambda \hat v(s) + c(s,a) \,] \}$

for every $s \in S$ which is a continuity point of $a(\cdot)$ and

(10)  $\sigma_j ( c(r_j, a(r_j)) - \gamma_j ) = 0$  for $j = 0,1$.

Proof. Suppose (9) and (10) hold. By (5) we have

$\hat v''(s) + d(s,a(s))^{-1} [\, b(s,a(s))\, \hat v'(s) - \lambda \hat v(s) + c(s,a(s)) \,] = 0$

and by (6) we have

$(\theta_j + \kappa_j) \hat v(r_j) - \theta_j \int_S ( \hat v(s) + \nu_j(s) )\, d\mu_j(s) - (-1)^j \pi_j \hat v'(r_j) + \sigma_j ( \lambda \hat v(r_j) - c(r_j, a(r_j)) ) - \kappa_j \lambda_j = 0, \quad j = 0,1.$

Hence $\hat v(s)$ satisfies (1) and (2) so $\hat v(s) = v(s,a)$ and
$a(s)$ is an optimal control.

Conversely, suppose $a \in M$ is an optimal control but that (9) does
not hold, that is, for some $s \in S$ which is a continuity point of
$a(s)$ we have

$d(s,a(s))^{-1} [\, b(s,a(s))\, \hat v'(s) - \lambda \hat v(s) + c(s,a(s)) \,] > \min_{a \in A_s} \{ d(s,a)^{-1} [\, b(s,a)\, \hat v'(s) - \lambda \hat v(s) + c(s,a) \,] \}.$

Defining $\psi(s)$ as in (7) we note that $\psi(s) < 0$ in some
neighborhood of $s$. Using the arguments following (7) and (8), we
conclude $v(s,a) > \hat v(s)$, which is a contradiction.

Finally, suppose $a \in M$ is an optimal control and (9) holds, but
(10) does not. Using the arguments following (7) and (8) again we have
that $v(s,a) - \hat v(s)$ is the expected discounted cost of a process
with zero continuous movement, instantaneous return and absorption costs
but with positive adhesion costs. Thus $v(r_j,a) > \hat v(r_j)$ for
$j = 0$ and/or $j = 1$, a contradiction, and Theorem 6 is proved.

The minimal expected discounted cost and an optimal control may in
principle be calculated for a process as follows. Define the function
$f(s,y,z)$ from $S \times E^2$ into $K$ such that $f(s,y,z) \in A_s$,
$\sigma_j ( c(r_j, f(r_j,y,z)) - \gamma_j ) = 0$ for $j = 0,1$, and

$d(s,f(s,y,z))^{-1} [\, b(s,f(s,y,z))\, z - \lambda y + c(s,f(s,y,z)) \,] = \min_{a \in A_s} \{ d(s,a)^{-1} [\, b(s,a)\, z - \lambda y + c(s,a) \,] \}$

for all appropriate $s \in S$ and $y, z \in E$. Then the minimal
expected discounted cost $\hat v(s)$ will be the unique solution on $S$
to

$d(s, f(s,\hat v(s),\hat v'(s)))\, \hat v''(s) + b(s, f(s,\hat v(s),\hat v'(s)))\, \hat v'(s) - \lambda \hat v(s) + c(s, f(s,\hat v(s),\hat v'(s))) = 0$

satisfying (6). The function $f(s,\hat v(s),\hat v'(s))$ on $S$ will
then be an optimal control provided it is admissible, that is, piecewise
continuous.
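The pointwise minimization that defines $f(s,y,z)$ can be sketched
numerically by searching a finite sample of the action set. The Python
code below is a hedged illustration only: the grid search, the function
names, and the particular coefficients used in the demonstration are
assumptions, and the boundary requirement on $f(r_j,y,z)$ is not
enforced.

```python
def minimizing_action(s, y, z, d, b, c, lam, actions):
    """Return an action from `actions` that minimizes
    d(s,a)^{-1} [ b(s,a) z - lam y + c(s,a) ] for given s, y = v(s), z = v'(s)."""
    def objective(a):
        return (b(s, a) * z - lam * y + c(s, a)) / d(s, a)
    return min(actions, key=objective)


def sampled_action_set(s, k, m=10):
    """Hypothetical action set [-k s, k s] sampled at 2m + 1 points."""
    return [k * s * i / m for i in range(-m, m + 1)]


if __name__ == "__main__":
    # Illustrative coefficients: constant diffusion and cost rate,
    # drift proportional to the action.
    d = lambda s, a: 1.0
    b = lambda s, a: 0.7 * a / s
    c = lambda s, a: 1.0
    actions = sampled_action_set(2.0, 1.5)
    # A negative slope z selects the largest action, a positive slope
    # the smallest one.
    print(minimizing_action(2.0, 0.0, -1.0, d, b, c, 0.1, actions))
    print(minimizing_action(2.0, 0.0, 1.0, d, b, c, 0.1, actions))
```

When ties occur, `min` returns the first minimizer in the sample, so the
resulting selection need not be piecewise continuous; admissibility
still has to be checked separately, as noted in the text.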

The following example demonstrates that the optimal control is a
function of the discount factor $\lambda$.

Example.

$A_s = \{ a \in E \mid ks \ge |a| \}$, $k > 0$,

$d(s,a) = d > 0$,

$b(s,a) = ba/s$, $b > 0$,

$c(s,a) = c > 0$,

$r_0$ boundary condition: $v'(r_0) = 0$ (reflection),
$r_1$ boundary condition: $v(r_1) = \lambda_1$ (absorption with cost $\lambda_1$).

Upon substitution into (5) one observes that the optimal action assumes
either the maximum or minimum value as the derivative of the minimal
expected discounted cost is respectively negative or positive. If
$\hat v(r_0) = c/\lambda$, then the unique solution of (5) is
$\hat v(s) = c/\lambda$. If $\hat v(r_0) > c/\lambda$ and $a(s) = -ks$,
then $\hat v'(s) \ge 0$ so $a(s) = -ks$ is optimal and
$\hat v(r_1) \ge \hat v(r_0) > c/\lambda$. Similarly,
$\hat v(r_0) < c/\lambda$ implies $a(s) = ks$ is optimal and
$\hat v(r_1) \le \hat v(r_0)$. Since $\hat v(r_1) = \lambda_1$
determines $\hat v(r_0)$ uniquely, we conclude that the optimal control
is given by

$a(s) = -ks$  if  $\lambda_1 \ge c/\lambda$,

$a(s) = ks$  if  $\lambda_1 \le c/\lambda$.
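The decision rule of this example is simple enough to state as code.
The Python sketch below merely restates the rule, with hypothetical
names: `absorb_cost` stands for the absorption cost at $r_1$ and `lam`
for the discount rate. In the tie case, where the absorption cost equals
`c / lam`, either action is optimal and the code arbitrarily returns the
first.

```python
def example_optimal_control(s, k, c, lam, absorb_cost):
    """Optimal action at state s in the example: -k s when the absorption
    cost is at least c / lam, and k s otherwise."""
    if absorb_cost >= c / lam:
        return -k * s
    return k * s


if __name__ == "__main__":
    # Same costs, different discount rates: the optimal control flips sign.
    print(example_optimal_control(2.0, 1.0, 1.0, 1.0, 2.0))   # c/lam = 1
    print(example_optimal_control(2.0, 1.0, 1.0, 0.25, 2.0))  # c/lam = 4
```

With heavy discounting the running cost $c/\lambda$ is small relative to
the absorption cost, so the control steers away from the absorbing
boundary; with light discounting the opposite holds.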

3. The Undiscounted Cost Case.

Mandl [16, pp. 158-173] provides results for the undiscounted cost case
($\lambda = 0$) when the controls are real-valued functions and the sets
of admissible actions are independent of the state, that is, for some
compact $K \subset E$, $A_s = K$ for all $s \in S$. The purpose of this
section is to generalize his results in accordance with the formulation
of section 1.

For the undiscounted cost case there are two situations; either the
process is conservative or non-conservative. For the purposes of this
section, the boundary conditions are said to be non-conservative if at
least one boundary is absorbing and neither boundary is purely adhesive,
that is,

$\kappa_0 + \kappa_1 > 0, \quad \kappa_j + \pi_j + \theta_j > 0, \quad j = 0,1.$


Let $v(s,a)$ denote the expected cost of such a process corresponding
to the control $a \in M$. Then $v(s,a)$ will be the unique solution of
(1) and (2) with $\lambda = 0$. The minimal expected cost $\hat v(s)$ is
defined to be

$\hat v(s) = \inf_{a \in M} v(s,a).$

An admissible control $a \in M$ is said to be optimal if
$\hat v(s) = v(s,a)$ for all $s \in S$. The main result for
non-conservative processes and undiscounted costs is the following.


Theorem 7. Suppose the boundary conditions are non-conservative. Then
the minimal expected cost $\hat v(s)$ is the unique solution of

(11)  $\hat v''(s) + \min_{a \in A_s} \{ d(s,a)^{-1} [\, b(s,a)\, \hat v'(s) + c(s,a) \,] \} = 0$

satisfying

(12)  $(\theta_j + \kappa_j) \hat v(r_j) - \theta_j \int_S ( \hat v(s) + \nu_j(s) )\, d\mu_j(s) - (-1)^j \pi_j \hat v'(r_j) - \sigma_j \gamma_j - \kappa_j \lambda_j = 0, \quad j = 0,1,$

where $\gamma_j = \min_{a \in A_{r_j}} c(r_j, a)$, $j = 0,1$.

A control $a \in M$ is optimal if and only if

(13)  $d(s,a(s))^{-1} [\, b(s,a(s))\, \hat v'(s) + c(s,a(s)) \,] = \min_{a \in A_s} \{ d(s,a)^{-1} [\, b(s,a)\, \hat v'(s) + c(s,a) \,] \}$

for every $s \in S$ which is a continuity point of $a(\cdot)$ and

(14)  $\sigma_j ( c(r_j, a(r_j)) - \gamma_j ) = 0, \quad j = 0,1.$


Proof. There exists a unique solution $\hat v(s)$ to (11), (12) by
virtue of Theorem 14, Chapter III. The remainder of the proof proceeds
as with Theorems 3 and 6, so it will be omitted.

For the purposes of this section, the boundary conditions are said to
be conservative if neither boundary is absorbing and at least one
boundary is not purely adhesive, that is,

$\kappa_0 = \kappa_1 = 0, \quad \pi_0 + \theta_0 + \pi_1 + \theta_1 > 0.$

Let $\Theta(a)$ denote the mean cost per unit time of such a process
corresponding to the admissible control $a \in M$. Then $\Theta(a)$ is
the unique number for which there exists a solution to (3), (4). The
minimal mean cost $\hat\Theta$ is defined to be

$\hat\Theta = \inf_{a \in M} \Theta(a).$

An admissible control $a \in M$ is said to be an optimal control if $\theta(a) = \theta$. The main result for conservative processes and undiscounted costs is the following.

Theorem 8. Suppose the boundary conditions are conservative. The minimal mean cost is the unique number $\theta$ such that the equation

$$w'(s) + \min_{a \in A_s} \{ d(s,a)^{-1} [\, b(s,a) w(s) - \theta + c(s,a) \,] \} = 0 \tag{15}$$

has a solution $w(\cdot)$ satisfying

w(y)dy  +  v  (s)  dy  (s)  +  (-l)^w(r  ) 
J  J  J  J 


+  °j(Yj  "  5)  =  °  1  j  =  0>1‘ 


where $\gamma_j = \min_{a \in A_{r_j}} c(r_j,a)$, $j = 0,1$.

A control $a \in M$ is optimal if and only if

$$d(s,a(s))^{-1} [\, b(s,a(s)) w(s) - \theta + c(s,a(s)) \,] = \min_{a \in A_s} \{ d(s,a)^{-1} [\, b(s,a) w(s) - \theta + c(s,a) \,] \} \tag{17}$$

for every $s \in S$ which is a continuity point of $a(\cdot)$ and


<16)  6j  /  / 

S  Lr . 
$$\sigma_j \, ( c(r_j, a(r_j)) - \gamma_j ) = 0, \qquad j = 0,1. \tag{18}$$


Proof. Equation (15) has a solution satisfying (16) for a unique number $\theta$ by Theorem 16 of Chapter III. If $\theta(a) < \theta$ then a contradiction can be obtained as was done with Theorem 3 of Chapter III to show the unicity of $v(r_0)$. Using the reasoning of Theorem 3, there exists a sequence $\{a_n(s)\}$ of admissible controls such that $\theta(a_n) \to \theta$ as $n \to \infty$, so $\theta = \inf_{a \in M} \theta(a)$.

To prove the necessary and sufficient condition for a control to be optimal, if (17) and (18) are true, then $w(s)$ and $\theta$ satisfy (3), (4), so $\theta = \theta(a)$ and $a(s)$ is optimal. Conversely, if $a(s)$ is optimal but (17) is violated at a continuity point of $a(s)$, then employing the reasoning of Theorem 3 of Chapter III used to show the unicity of $v(r_0)$, we construct a process with negative continuous movement costs, non-positive adhesion costs, and zero instantaneous return costs but with a zero mean cost per unit time, a contradiction. Finally, if $\theta = \theta(a)$ and (17) holds but (18) is violated, then a similar contradiction is obtained, and Theorem 8 is proved.


4. Existence of Admissible Optimal Controls

There is no guarantee that an admissible optimal control will exist for a controlled diffusion process. A piecewise continuous optimal control need not exist, as the following example shows: $S = A_s = [-1, 1]$
for the special case where the state equation is linear in the state and control together. The following results will serve to weaken this linearity restriction, although it should be borne in mind that the classical results are for N-dimensional systems. The existence theorems in this section will all be for the discounted cost case; analogous results hold for the undiscounted cost case.


We say the function $f(s) : S \to E$ is analytic at $\hat{s}$ if it has an absolutely convergent power series expansion $f(s) = \sum_{j=0}^{\infty} a_j (s - \hat{s})^j$ in some neighborhood of $\hat{s}$. The function $f(s)$ is analytic on the interval $S_1 \subset S$ if there exists an open interval $S_2 \supset S_1$ and a function $g(s)$ which is analytic at each $s \in S_2$ such that $f(s) = g(s)$ for each $s \in S_1$. The function $f(s)$ is piecewise analytic on $S$ if $S$ can be decomposed into a finite number of intervals on each of which $f(s)$ is analytic. The following theorem is the main result of this section.


Theorem 9. If $A_s = \{1,2,\dots,N\}$ and if $d(\cdot,a)$, $b(\cdot,a)$ and $c(\cdot,a)$ are piecewise analytic on $S$ for each fixed $a = 1,2,\dots,N$, then there exists a piecewise constant optimal control.

Proof. The function $d(s,a)$ is positive for all $s \in S$ and all $a = 1,\dots,N$, so $\alpha(s,a) = d(s,a)^{-1}$, $\beta(s,a) = b(s,a)d(s,a)^{-1}$, and $\gamma(s,a) = c(s,a)d(s,a)^{-1}$ are piecewise analytic in $s$ on $S$ for each fixed $a = 1,\dots,N$. Let $v(s)$ be the minimal expected discounted cost for this process. Let $a(s)$ be a function on $S$ which satisfies $a(s) \in \{1,\dots,N\}$ and

$$v''(s) = -\beta(s,a(s))\,v'(s) + \lambda\alpha(s,a(s))\,v(s) - \gamma(s,a(s))$$

for each $s \in S$. Thus $a(s)$ will be an admissible optimal control if it can be chosen piecewise constant. To prove this choice is possible, it suffices to show for each $\hat{s} \in S$ the existence of some $\delta > 0$ such that $a(s)$ can be chosen constant on $(\hat{s} - \delta, \hat{s}) \cap S$ and on $(\hat{s}, \hat{s} + \delta) \cap S$. We discuss only the second case, leaving the other to the reader. For $i = 1,\dots,N$ and arbitrary $\hat{s} \in [r_0, r_1)$, let $v_i(s)$ be the unique solution on $S$ to

$$v_i''(s) = -\beta(s,i)\,v_i'(s) + \lambda\alpha(s,i)\,v_i(s) - \gamma(s,i)$$

and

$$v_i(\hat{s}) = v(\hat{s}), \qquad v_i'(\hat{s}) = v'(\hat{s}).$$

From differential equation theory, $v_i(s)$ is piecewise analytic and therefore analytic on $(\hat{s}, \hat{s} + \delta)$ for some $\delta > 0$. Hence for some $\delta > 0$ there exists some integer $j$ in $\{1,\dots,N\}$ such that $v_j(s) \ge v_i(s)$ for all $i = 1,\dots,N$ and all $s \in (\hat{s}, \hat{s} + \delta)$. We shall now show that the action $a(s) = j$ is optimal for all small enough $s > \hat{s}$.

For this $j$ and each $i = 1,\dots,N$ define

$$\psi_i(s) = [\beta(s,i) - \beta(s,j)]\,v_j'(s) - \lambda[\alpha(s,i) - \alpha(s,j)]\,v_j(s) + [\gamma(s,i) - \gamma(s,j)]$$

and note that $\psi_i(s)$ is analytic on $(\hat{s}, \hat{s} + \delta)$ for small enough $\delta > 0$. If we denote $\varphi_i(s) = v_j(s) - v_i(s)$, then $\varphi_i(s)$ will be the unique solution on $S$ to

$$\varphi_i''(s) = -\beta(s,i)\,\varphi_i'(s) + \lambda\alpha(s,i)\,\varphi_i(s) + \psi_i(s)$$

satisfying $\varphi_i(\hat{s}) = \varphi_i'(\hat{s}) = 0$. For some $\delta > 0$, $\psi_i(s)$ is either uniformly positive, uniformly negative, or vanishes identically on $(\hat{s}, \hat{s} + \delta)$. By differential equation theory and the choice of $j$, we conclude for some $\delta > 0$ and each $i = 1,\dots,N$ that $\psi_i(s) \ge 0$ for all $s \in (\hat{s}, \hat{s} + \delta)$.

It follows that for all small enough $s > \hat{s}$ we have

$$v_j''(s) = -\min_{i \in \{1,\dots,N\}} [\, \beta(s,i)\,v_j'(s) - \lambda\alpha(s,i)\,v_j(s) + \gamma(s,i) \,],$$

$v_j(\hat{s}) = v(\hat{s})$, and $v_j'(\hat{s}) = v'(\hat{s})$. Because of the uniqueness of solutions to this equation this implies $v(s) = v_j(s)$ for small enough $s > \hat{s}$. Hence we can choose $a(s) = j$ as the optimal action for each small enough $s > \hat{s}$.
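The pointwise selection that underlies this proof can be illustrated numerically. The following is a minimal sketch, not the author's procedure: given assumed values of $v(s)$ and $v'(s)$ at one state and assumed coefficient functions for a finite action set, it returns the action minimizing $d(s,a)^{-1}[b(s,a)v'(s) - \lambda v(s) + c(s,a)]$. The function names `hamiltonian` and `best_action`, the coefficient functions, and all numerical values are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence, Tuple


def hamiltonian(d: float, b: float, c: float, v: float, dv: float, lam: float) -> float:
    """Return d^{-1} [ b v' - lam v + c ], the quantity minimized over actions."""
    if d <= 0.0:
        raise ValueError("the diffusion coefficient d must be positive")
    return (b * dv - lam * v + c) / d


def best_action(
    s: float,
    actions: Sequence[int],
    d: Callable[[float, int], float],
    b: Callable[[float, int], float],
    c: Callable[[float, int], float],
    v: float,
    dv: float,
    lam: float,
) -> Tuple[int, float]:
    """Pick the action with the smallest Hamiltonian value at state s."""
    values = [(hamiltonian(d(s, a), b(s, a), c(s, a), v, dv, lam), a) for a in actions]
    h_min, a_min = min(values)
    return a_min, h_min


# Hypothetical two-action data: action 1 drifts upward cheaply,
# action 2 drifts downward at an extra cost of 0.5 per unit time.
d_fn = lambda s, a: 1.0
b_fn = lambda s, a: 1.0 if a == 1 else -1.0
c_fn = lambda s, a: s if a == 1 else s + 0.5

# Where v is decreasing (v' < 0) the upward drift is preferred; where v is
# increasing steeply enough the downward drift is worth its extra cost.
action_low, _ = best_action(0.2, [1, 2], d_fn, b_fn, c_fn, v=1.0, dv=-1.0, lam=0.1)
action_high, _ = best_action(0.8, [1, 2], d_fn, b_fn, c_fn, v=1.0, dv=1.0, lam=0.1)
```

Applying `best_action` on a grid of states gives a piecewise constant candidate control, provided that $v$ and $v'$ are already known; computing them is the hard part and is not attempted here.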


Corollary 10. Let $A_s = \{f_1(s),\dots,f_N(s)\}$ and suppose for $i = 1,\dots,N$ that $f_i(s)$ is a bounded, piecewise continuous, vector-valued function on $S$ and the functions $d(s,f_i(s))$, $b(s,f_i(s))$, and $c(s,f_i(s))$ are piecewise analytic on $S$. Then a piecewise continuous, admissible optimal control exists.

Proof. If we set $c(s,i) = c(s,f_i(s))$, etc., so that $A_s = \{1,\dots,N\}$, then the proof is immediate.

An admissible optimal control will therefore exist if, among other things, the map $A_s$ is finite valued and piecewise continuous. Note that the number of available actions may vary from state to state. The hypotheses of Corollary 10 are satisfied by a variety of functions. For example, if $d(s,a)$ is an analytic function of the two variables $s$ and $a$, and $f_i(s)$ is analytic in $s$, then $d(s,f_i(s))$ is analytic because the composition of analytic functions is analytic. On the other hand, suppose $d(s,a)$ is an analytic function of $s$ for each fixed $a$, but it is not analytic in the two variables $s$ and $a$ together. If $f_i(s)$ is a piecewise constant function, then $d(s,f_i(s))$ is piecewise analytic.

The following theorem exploits the fact that if the optimal action is unique for all but a finite number of $s \in S$, then a piecewise continuous optimal control must exist.

Theorem 11. Let $v(s)$ be the minimal expected discounted cost, suppose $A_s$ is a convex set for all but a finite number of $s \in S$, and suppose $d(s,a)^{-1}[\,b(s,a)v'(s) - \lambda v(s) + c(s,a)\,]$ is a strictly convex function of $a$ for all but a finite number of $s \in S$. Then an admissible optimal control exists.

Proof. Let the unique (apart from a finite number of points) control $a(\cdot) : S \to K$ be such that $a(s) \in A_s$ and

$$d(s,a(s))^{-1} [\, b(s,a(s)) v'(s) - \lambda v(s) + c(s,a(s)) \,] = \min_{a \in A_s} \{ d(s,a)^{-1} [\, b(s,a) v'(s) - \lambda v(s) + c(s,a) \,] \}$$

for all $s \in S$. In view of the uniqueness, $a(\cdot)$ is piecewise continuous, completing the proof.

The proof of Corollary 12 is an immediate consequence of Theorem 11 and is therefore omitted.

Corollary 12. If $A_s$ is a convex set for all but a finite number of $s \in S$, $c(s,a)d(s,a)^{-1}$ is strictly convex in $a$ for all but a finite number of $s \in S$, and $d(s,a)^{-1}$ and $b(s,a)d(s,a)^{-1}$ are affine with respect to $a$, then an admissible optimal control exists.
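The omitted step can be spelled out in one line. The display below is a sketch of why the hypotheses of Corollary 12 imply those of Theorem 11; the symbol $H(s,a)$ is introduced here only for illustration and does not appear in the original text.

```latex
\begin{align*}
H(s,a) &= d(s,a)^{-1}\bigl[\,b(s,a)\,v'(s) - \lambda v(s) + c(s,a)\,\bigr] \\
       &= \underbrace{b(s,a)\,d(s,a)^{-1}}_{\text{affine in } a}\, v'(s)
          \;-\; \lambda\,\underbrace{d(s,a)^{-1}}_{\text{affine in } a}\, v(s)
          \;+\; \underbrace{c(s,a)\,d(s,a)^{-1}}_{\text{strictly convex in } a}.
\end{align*}
```

For fixed $s$ the numbers $v(s)$ and $v'(s)$ are constants, so the first two terms are affine in $a$, and the sum of an affine function and a strictly convex function is strictly convex. Hence $H(s,\cdot)$ is strictly convex wherever the hypotheses hold, which is the assumption of Theorem 11.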

The following theorem combines elements of the previous two.

Theorem 13. Let $A_s = [a_1(s), a_2(s)]$, a compact interval in $E$ for each $s \in S$, where $a_1(s)$ and $a_2(s)$ are bounded, piecewise continuous functions. Suppose $d(s,a_i(s))$, $b(s,a_i(s))$, and $c(s,a_i(s))$ are piecewise analytic on $S$ for $i = 1,2$. Let $S$ be decomposed into a finite number of intervals. If $S_k$ is any such interval, then suppose that one of the three functions $d(s,a)^{-1}$, $b(s,a)d(s,a)^{-1}$, or $c(s,a)d(s,a)^{-1}$ is either strictly convex or strictly concave in $a$, and the other two functions are affine in $a$, for all $s \in S_k$. Then an admissible optimal control exists.
Proof. Define $\alpha(s,a) = d(s,a)^{-1}$, $\beta(s,a) = b(s,a)d(s,a)^{-1}$, and $\gamma(s,a) = c(s,a)d(s,a)^{-1}$. We can assume without loss of generality that $a_1(s)$ and $a_2(s)$ are continuous on $S$; that $\alpha(s,a_i(s))$, $\beta(s,a_i(s))$, and $\gamma(s,a_i(s))$ are analytic on $S$ for $i = 1,2$; and that the decomposition is the trivial one, $S_1 = S$. There will then be three cases, corresponding to which of the three functions $\alpha(s,a)$, $\beta(s,a)$ or $\gamma(s,a)$ is strictly concave or strictly convex. Throughout, we use the fact that $v''(s)$ is continuous on $S$ (Lemma 5, Chapter III).

Case (i): $\gamma(s,a)$ strictly concave or strictly convex in $a$.

If $\gamma(s,a)$ is strictly concave in $a$, then so is $\beta(s,a)v'(s) - \lambda\alpha(s,a)v(s) + \gamma(s,a)$. This function is minimized by either the action $a = a_1(s)$ or $a = a_2(s)$, so this case reduces to the finite action situation and Corollary 10 applies. The situation where $\gamma(s,a)$ is strictly convex is covered by Corollary 12.
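The reduction used in Case (i) rests on the fact that a concave function on a compact interval attains its minimum at an endpoint. The sketch below illustrates this at a single state; the coefficient functions `beta`, `alpha`, `gamma` and the values of `v`, `dv`, `lam` are hypothetical, chosen so that the first two are affine and the third is strictly concave in the action.

```python
from typing import Callable, Tuple


def minimize_concave_on_interval(
    h: Callable[[float], float], a1: float, a2: float
) -> Tuple[float, float]:
    """Minimize a concave h over [a1, a2] by comparing the two endpoint values."""
    if a1 > a2:
        raise ValueError("expected a1 <= a2")
    h1, h2 = h(a1), h(a2)
    return (a1, h1) if h1 <= h2 else (a2, h2)


# Hypothetical data at one fixed state s.
v, dv, lam = 2.0, 0.5, 0.1
beta = lambda a: 1.0 - 2.0 * a           # affine in a
alpha = lambda a: 1.0 + 0.5 * a          # affine in a
gamma = lambda a: 1.0 - (a - 0.3) ** 2   # strictly concave in a


def h(a: float) -> float:
    """beta(a) v' - lam alpha(a) v + gamma(a), concave in a for this data."""
    return beta(a) * dv - lam * alpha(a) * v + gamma(a)


a_star, h_star = minimize_concave_on_interval(h, 0.0, 1.0)

# Brute-force check on a grid: no interior point beats the best endpoint.
grid_min = min(h(k / 1000.0) for k in range(1001))
```

Only two candidate actions need to be compared at each state, which is why the finite-action result (Corollary 10) applies with $f_i = a_i$.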

Case (ii): $\alpha(s,a)$ strictly concave or strictly convex in $a$.

By reasoning similar to the above, an admissible control is optimal provided $v(s)$ changes from zero to a non-zero value only a finite number of times as $s$ increases from $r_0$ to $r_1$. Suppose not, so that in any neighborhood of $\hat{s} \in S$, say, the function $v(s)$ changes sign infinitely often as $s \downarrow \hat{s}$. By continuity, it is necessary that

$$v(\hat{s}) = v'(\hat{s}) = \min_{a \in A_{\hat{s}}} \gamma(\hat{s},a) = 0.$$

Since $\gamma(s,a)$ is affine in $a$, $\min_{a \in A_s} \gamma(s,a) = \min_{i=1,2} \gamma(s,a_i(s))$. By analyticity, $\min_{a \in A_s} \gamma(s,a)$ is either zero, positive or negative for all small enough $s > \hat{s}$; we examine these three situations. In the first situation we must have for all such $s > \hat{s}$ that $v(s) = 0$, a contradiction. Secondly, if $\min_{a \in A_s} \gamma(s,a) > 0$ for all small enough $s > \hat{s}$, there will exist a sequence $s_j \downarrow \hat{s}$ such that $v(s_j) < 0$, $v'(s_j) = 0$, and $v''(s_j) \ge 0$. But for any large enough $j$ and some $a_j \in A_{s_j}$, we have

$$v''(s_j) = -\beta(s_j,a_j)\,v'(s_j) + \lambda\alpha(s_j,a_j)\,v(s_j) - \gamma(s_j,a_j) \le -\gamma(s_j,a_j) \le -\min_{a \in A_{s_j}} \gamma(s_j,a) < 0$$

by assumption, a contradiction. As the third and final situation, suppose $\min_{a \in A_s} \gamma(s,a) < 0$ for all small enough $s > \hat{s}$. There exists a sequence $s_j \downarrow \hat{s}$ such that $v(s_j) > 0$, $v'(s_j) = 0$, and $v''(s_j) \le 0$. There also exists a corresponding sequence $a_j \in A_{s_j}$ such that for all large enough $j$, $\gamma(s_j,a_j) < 0$. By the optimality condition, for any $j$,

$$-v''(s_j) \le -\lambda\alpha(s_j,a_j)\,v(s_j) + \gamma(s_j,a_j) \le \gamma(s_j,a_j).$$

Thus for all large enough $j$ we have $v''(s_j) > 0$, a contradiction.


Case (iii): $\beta(s,a)$ strictly concave or strictly convex in $a$.

By reasoning similar to the above, an admissible control is optimal if $v'(s)$ changes from zero to a non-zero value only a finite number of times as $s$ increases from $r_0$ to $r_1$. Suppose not, so that in any neighborhood of $\hat{s} \in S$, say, the function $v'(s)$ changes sign infinitely often as $s \downarrow \hat{s}$. By continuity, we must have $v'(\hat{s}) = v''(\hat{s}) = 0$. Define $y(s) = v(s) - v(\hat{s})$ so that $y(\hat{s}) = y'(\hat{s}) = 0$ and

$$y''(s) = -\min_{a \in A_s} \{ \beta(s,a)\,y'(s) - \lambda\alpha(s,a)\,y(s) + \bar{\gamma}(s,a) \},$$

where $\bar{\gamma}(s,a) = \gamma(s,a) - \lambda\alpha(s,a)\,v(\hat{s})$. Note that $\bar{\gamma}(s,a)$ satisfies the same hypotheses as $\gamma(s,a)$. In particular, $\min_{a \in A_s} \bar{\gamma}(s,a) = \min_{i=1,2} \bar{\gamma}(s,f_i(s))$ and this function is either positive, zero, or negative for all small enough $s > \hat{s}$. We now examine a hierarchy of situations.

If  rain  y(s,f.(s))  >  0  for  all  small  enough  s  >  s,  than  by 
i-1,2 

differential  equation  theory  we  have  that  y(s)  >  0  and  y'(s)  0 

for  all  such  s  ,  Thus  there  exists  some  3q  »  s  in  this  neighborhood 
with  y'(sQ)  -  0  and  >'(,9q)  *  0  *  Then  by  Lemma  10  of  Chapter  III  we 
have  y'(s)  >  0  for  all  s  >  Sq  in  the  specified  neighborhood  of  s, 
a  contradiction.  If  y(s,f^(s))  and  yfs^Cs))  are  both  non-positive 
for  all  small  enough  9  >  s^,  then  a  similar  contradiction  is  obtained 

As the final situation we need to consider, suppose $\bar{\gamma}(s,f_1(s)) > 0$ but $\bar{\gamma}(s,f_2(s)) < 0$ for all small enough $s > \hat{s}$ (the proof for $\bar{\gamma}(s,f_1(s)) < 0 < \bar{\gamma}(s,f_2(s))$ is similar and left to the reader). Define $w(s) = \bar{\gamma}(s,f_2(s)) / [\lambda\alpha(s,f_2(s))]$. Now $w(s) < 0$ for all small enough $s > \hat{s}$, and $v''(\hat{s}) = 0$ implies $w(\hat{s}) = 0$. Since $w(s)$ is analytic, we must have $w'(s) < 0$ for all small enough $s > \hat{s}$. First we'll show that $y(s) < 0$ for all small enough $s > \hat{s}$. If $s > \hat{s}$ is such that $y(s) \ge 0$ and $y'(s) = 0$ then

$$y''(s) \ge \lambda\alpha(s,f_2(s))\,y(s) - \bar{\gamma}(s,f_2(s)) > \lambda\alpha(s,f_2(s))\,w(s) - \bar{\gamma}(s,f_2(s)) = 0.$$

The continuity of $y''(s)$ and $y(s) \ge 0$ imply $y'(s) > 0$ for all small enough $s > \hat{s}$, a contradiction.

We now assume $y(s) < 0$ for all small enough $s > \hat{s}$ and show $y'(s) = 0$ implies $y''(s)$ has the same sign as $y(s) - w(s)$. This fact is immediate if $f_2(s)$ is an optimal action at this $s$. Alternatively, if $f_1(s)$ but not $f_2(s)$ is optimal, then

$$-\lambda\alpha(s,f_2(s))\,w(s) + \bar{\gamma}(s,f_2(s)) = 0 < -\lambda\alpha(s,f_1(s))\,y(s) + \bar{\gamma}(s,f_1(s)) = -y''(s) < -\lambda\alpha(s,f_2(s))\,y(s) + \bar{\gamma}(s,f_2(s)),$$

so $y''(s) < 0$ and $y(s) < w(s)$.

We now make the concluding arguments by considering the three situations corresponding to the sign of $y'(s)$ for all small enough $s > \hat{s}$. First, we have $y(s) < 0$, so $y'(s) \ge 0$ for all small enough $s > \hat{s}$ leads to a contradiction. Secondly, if $y'(s) \le 0$ for all small enough $s > \hat{s}$, then $y'(s) = 0$ implies $f_2(s)$ is the unique optimal action, because otherwise $f_1(s)$ is optimal, $y''(s) < 0$, and, by the continuity of $y''(s)$, a contradiction is obtained. Hence $y'(s) \le 0$ for all small enough $s > \hat{s}$ implies some admissible control is optimal. This is because if $\beta(s,a)$ is convex with respect to $a$ then $\beta(s,a)y'(s) - \lambda\alpha(s,a)y(s) + \bar{\gamma}(s,a)$ is concave and by earlier reasoning, the optimal control is piecewise continuous in this neighborhood. On the other hand, if $\beta(s,a)$ is strictly concave with respect to $a$ then the optimal action is unique for each small enough $s > \hat{s}$, in which case the optimal control is continuous in this neighborhood.

As the final situation, suppose $y'(s)$ assumes both positive and negative values in every neighborhood of $\hat{s}$. By the preliminaries above, there exists a sequence $\{s_j\} \downarrow \hat{s}$ such that $y'(s_j) = 0$; $j$ even implies $y''(s_j) \ge 0$, $y(s_j) \ge w(s_j)$, and $y'(s) \ge 0$ for $s \in [s_j, s_{j-1}]$; and $j$ odd implies $y''(s_j) \le 0$, $y(s_j) \le w(s_j)$, and $y'(s) \le 0$ for $s \in [s_j, s_{j-1}]$. Thus if $j$ is even we have $w(s)$ crossing $y(s)$ from below as $s$ increases from $s_j$ to $s_{j-1}$. Since $y'(s) \ge 0$ for all such $s$ we must have $w'(s) \ge 0$ for some such $s$, which is a contradiction for large enough even $j$. Hence in every situation either some admissible control is optimal or a contradiction can be obtained, and Theorem 13 is proved.


Corollary 14. Let the hypotheses of Theorem 13 be satisfied except that, if $S_k$ is an interval with $d(s,a)^{-1}$ and $b(s,a)d(s,a)^{-1}$ affine, then $c(s,a)d(s,a)^{-1}$ is either affine, concave, or strictly convex in $a$ for all $s \in S_k$. Then an admissible optimal control exists.

Proof. The proof of Case (i) in the proof of Theorem 13 goes through without change.




Corollary 15. Let the hypotheses of Theorem 13 be satisfied except that if $S_k$ is an interval in the decomposition of $S$ then one of the three functions $d(s,a)^{-1}$, $b(s,a)d(s,a)^{-1}$, $c(s,a)d(s,a)^{-1}$ is analytic in $(s,a)$ jointly on $S_k \times K$ and either concave or convex but not affine in $a$ for all $s \in S_k$, and the other two functions are affine in $a$ for all $s \in S_k$. Then an admissible optimal control exists.

Proof. In view of the proof of Theorem 13, it suffices to show that if $f(s,a)$ is some function which is analytic in $(s,a)$ jointly on $S_k \times K$ and concave but not affine in $a$ for all $s \in S_k$, then $f(s,a)$ is strictly concave in $a$ for all but a finite number of $s \in S_k$ (the proof for $f(s,a)$ convex is similar and left to the reader). Since $f(s,a)$ is not affine in $a$, there exists some $\hat{s} \in S_k$ such that $f(\hat{s},a)$ is not affine. The analytic function $\frac{\partial^2}{\partial a^2} f(\hat{s},a)$ is thus negative for all but a finite number of $a \in K$ including, say, $\hat{a} \in A_{\hat{s}}$. It follows that the analytic function $\frac{\partial^2}{\partial a^2} f(s,\hat{a})$ is negative for all but a finite number of $s \in S_k$, in which case we must have $\frac{\partial^2}{\partial a^2} f(s,a)$ non-zero and $f(s,a)$ strictly concave in $a$ for all but a finite number of $s \in S_k$.


5. Application: Control of a Dam

Suppose that the water level of a reservoir behaves like a stationary Markov process and fluctuates indefinitely in a continuous fashion between the numbers $r_0 < r_1$, which correspond respectively to the bottom of the reservoir and the top of the dam. Furthermore, suppose that the water level can be controlled to a certain extent by discharging water from the reservoir and that there exists a cost or utility rate associated with alternative discharge rates. Finally, suppose that there exists a second cost associated with the water level being at a particular value for one unit of time. Then the problem of optimally controlling this reservoir system may perhaps be stated as the problem of optimally controlling a conservative diffusion process.

It will be assumed that the water level of the reservoir behaves like a controlled diffusion process with reflection, or possibly reflection combined with adhesion, at each of the boundaries $r_0$ and $r_1$. The control action corresponds to the rate of water discharge through the dam and the control will be a piecewise continuous function of the water level. In addition, it is assumed that the costs of the reservoir system can be represented by a continuous movement cost, that is, the sum of the control and water level cost rates will be a continuous function of the discharge rate and the water level. Thus the diffusion process will be conservative, and in the case of undiscounted costs the optimal control will be that admissible control which yields the minimum expected cost per unit time. In the discounted cost case the optimal control will yield the minimum expected discounted cost, which will be a function of the initial water level.

This model is essentially a generalization of one by Bather [1]. His model assumes the reservoir input rate behaves like ordinary Brownian motion with positive drift, that a cost rate is associated with alternative discharge rates but not with alternative water levels, and that all controls must be continuous functions of the water level. In addition, if his water level process becomes negative then he assumes the water level is actually zero, so that pure reflection is impossible. Finally, his only optimality criterion is that of maximizing the expected utility per unit time.

Example. This example of controlling a reservoir uses the discounted cost criterion for evaluating optimality. The state space $S$ and the set-valued function $A_s$ of admissible actions equal the unit interval. The diffusion coefficient equals the positive constant $A$, and the drift coefficient equals $B(1 - 2a)$, where $B$ is a positive constant. The continuous movement cost is a convex quadratic function of the water level, namely $ps^2 - ps + q$, where $p > 0$ and $q$ are arbitrary numbers. The boundary conditions, at $s = 0$ and $s = 1$, are pure reflection. The intuitively obvious control is to try to maintain the water level at $s = \tfrac{1}{2}$, that is, maintain a minimum discharge rate ($a = 0$) when the water level is less than $\tfrac{1}{2}$ and maintain a maximum discharge rate ($a = 1$) otherwise. If this conjecture is correct then by symmetry the derivative of the minimal expected discounted cost will equal zero at $s = \tfrac{1}{2}$. Solving equation (1) on $[0,\tfrac{1}{2}]$ with $a(s) = v'(0) = v'(\tfrac{1}{2}) = 0$, we obtain the solution
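$$v(s) = L_1 e^{\ell_1 s} + L_2 e^{\ell_2 s} + \frac{1}{\lambda}(ps^2 - ps + q) + \frac{2Ap}{\lambda^2} + \frac{B}{\lambda^2}(2ps - p) + \frac{2B^2 p}{\lambda^3}, \qquad s \in [0,\tfrac{1}{2}],$$

where

$$L_1 = \frac{p\,[\,-\lambda e^{\ell_2/2} + 2B e^{\ell_2/2} - 2B\,]}{\lambda^2 \ell_1 \,(e^{\ell_1/2} - e^{\ell_2/2})}, \qquad L_2 = \frac{p\,[\,\lambda e^{\ell_1/2} - 2B e^{\ell_1/2} + 2B\,]}{\lambda^2 \ell_2 \,(e^{\ell_1/2} - e^{\ell_2/2})},$$

$$\ell_1 = \frac{-A^{-1}B + \sqrt{A^{-2}B^2 + 4A^{-1}\lambda}}{2}, \qquad \text{and} \qquad \ell_2 = \frac{-A^{-1}B - \sqrt{A^{-2}B^2 + 4A^{-1}\lambda}}{2}.$$

Similarly, solving (1) on $(\tfrac{1}{2},1]$ with $a(s) = 1$ and $v'(\tfrac{1}{2}) = v'(1) = 0$, we obtain

$$v(s) = N_1 e^{-\ell_1 s} + N_2 e^{-\ell_2 s} + \frac{1}{\lambda}(ps^2 - ps + q) + \frac{2Ap}{\lambda^2} - \frac{B}{\lambda^2}(2ps - p) + \frac{2B^2 p}{\lambda^3}, \qquad s \in [\tfrac{1}{2},1],$$

where the constants $N_1$ and $N_2$ are determined by the conditions $v'(\tfrac{1}{2}) = v'(1) = 0$.

After verifying that $v(s)$ is continuous at $s = \tfrac{1}{2}$, we conclude that $v(s)$ is the expected discounted cost corresponding to the admissible control $a(s) = 0$ for $0 \le s < \tfrac{1}{2}$ and $a(s) = 1$ for $\tfrac{1}{2} \le s \le 1$. It remains to show that $a(s)$ is optimal. Since it can be shown that $v(s) = v(1 - s)$ for all $s \in [0,\tfrac{1}{2}]$, by (9) it suffices to show that $v'(s) \le 0$ for all $s \in [0,\tfrac{1}{2}]$. Since the continuous movement cost $ps^2 - ps + q < q$ for all $s \in (0,1)$, we must have $v(s) < \int_0^{\infty} e^{-\lambda t} q \, dt$ for all $s \in S$. In particular, $\lambda v(0) < q$. Similarly, $\lambda v(\tfrac{1}{2}) > q - \tfrac{p}{4}$. It follows that $v''(0) < 0 < v''(\tfrac{1}{2})$. We know that $\ell_1 > 0$, $\ell_2 < 0$, and $L_1 < 0$. If $L_2 \ge 0$, then $v'(s)$ would be concave, a contradiction. Thus, with $L_2 < 0$, there exists some $\bar{s} \in [0,\tfrac{1}{2}]$ such that $v'(s)$ is convex for $0 \le s \le \bar{s}$ and is concave for $\bar{s} \le s \le \tfrac{1}{2}$. If $v'(s) > 0$ for any $s \in [0,\tfrac{1}{2}]$, then a contradiction is obtained, so we must have $a(s)$ optimal and $v(s)$ the minimal expected discounted cost.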
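The closed form on $[0,\tfrac{1}{2}]$ can be checked numerically. The sketch below assumes that, for this example, equation (1) reads $A v'' + B(1-2a) v' - \lambda v + (ps^2 - ps + q) = 0$; this form is inferred from the characteristic roots $\ell_1, \ell_2$ and is not quoted from the text. The parameter values are hypothetical. The code builds the solution for $a = 0$ from the boundary conditions $v'(0) = v'(\tfrac{1}{2}) = 0$, then evaluates the residual of the differential equation and the sign of $v'$ on a grid.

```python
import math

# Hypothetical parameter values chosen only for illustration.
A, B, lam, p, q = 1.0, 0.5, 0.2, 1.0, 1.0

# Roots of A*l**2 + B*l - lam = 0 (control a = 0, drift +B).
disc = math.sqrt((B / A) ** 2 + 4.0 * lam / A)
l1 = (-B / A + disc) / 2.0
l2 = (-B / A - disc) / 2.0


def particular(s: float) -> float:
    """Particular solution of A v'' + B v' - lam v + (p s^2 - p s + q) = 0."""
    return ((p * s * s - p * s + q) / lam + B * (2.0 * p * s - p) / lam**2
            + 2.0 * A * p / lam**2 + 2.0 * B**2 * p / lam**3)


def particular_d1(s: float) -> float:
    """First derivative of the particular solution."""
    return (2.0 * p * s - p) / lam + 2.0 * B * p / lam**2


# Constants fixed by reflection at 0 and the symmetry condition at 1/2.
e1, e2 = math.exp(l1 / 2.0), math.exp(l2 / 2.0)
g0, g1 = -particular_d1(0.0), -particular_d1(0.5)
L1 = (g1 - g0 * e2) / (l1 * (e1 - e2))
L2 = (g0 * e1 - g1) / (l2 * (e1 - e2))


def v(s: float) -> float:
    return L1 * math.exp(l1 * s) + L2 * math.exp(l2 * s) + particular(s)


def v_d1(s: float) -> float:
    return l1 * L1 * math.exp(l1 * s) + l2 * L2 * math.exp(l2 * s) + particular_d1(s)


def v_d2(s: float) -> float:
    return l1**2 * L1 * math.exp(l1 * s) + l2**2 * L2 * math.exp(l2 * s) + 2.0 * p / lam


def residual(s: float) -> float:
    """Left side of A v'' + B v' - lam v + c(s); should vanish on [0, 1/2]."""
    return A * v_d2(s) + B * v_d1(s) - lam * v(s) + (p * s * s - p * s + q)


grid = [k / 200.0 for k in range(101)]        # points in [0, 1/2]
max_residual = max(abs(residual(s)) for s in grid)
max_slope = max(v_d1(s) for s in grid)        # optimality of a = 0 needs v' <= 0
```

For these sample values both constants come out negative and $v'$ stays non-positive on the grid, in line with the argument above; other parameter choices would need to be checked separately.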


6. Application: Control of Pollution

Suppose that the index of pollution is constrained to fall between zero and some positive number. This would be the case, for example, when dealing with an air basin or a body of water. Assume that a factory, a collection of automobiles, or a similar polluting mechanism wants to control this index of pollution by optimally choosing the amount of its waste products that is being emitted as a pollutant as opposed to being processed in a pollution free manner. Finally, assume that there exists a cost to the controller for each level of control as well as to each value of the pollution index. Then the class of controlled diffusion processes used as models of dam-reservoir systems may perhaps be used as models of pollution systems.

The state of the process will correspond to the index of pollution, and the boundary behavior, at zero and the maximum index value, will be reflection or possibly reflection combined with adhesion. An admissible control will be a piecewise continuous function of the state space that will represent the portion of the controller's wastes that is being emitted as a pollutant. Presumably, (i) the bigger the control value the smaller the control cost rate (less needs to be processed), (ii) the bigger the control value the bigger the drift coefficient, and (iii) the bigger the pollution index the greater the pollutant cost rate. An optimal control will be an admissible control which yields either the minimal expected discounted cost or the minimal mean cost per unit time.

The choice of the proper upper boundary condition is open to question. One possibility other than reflection is absorption, with the interpretation that in the rare event the pollution ever reaches a sufficiently high, intolerable level, then a "disaster" would occur at some high cost. If it can be assumed that the pollution index rarely, if ever, attains its upper limit then the choice of the upper boundary condition becomes moot. This might be the case if the pollution index is thought of as the percent of the natural medium which has been replaced by pollutants. The oxygen in an air basin would never be completely replaced by smog, for example, so the drift and diffusion coefficients of the corresponding diffusion process model would be chosen accordingly. Then the process would rarely be affected by the upper boundary, so its effects could largely be ignored.

Example. This example of controlling pollution uses the discounted cost criterion for evaluating optimality. The state space $S$ and the set-valued function $A_s$ of admissible actions equal the unit interval. The diffusion coefficient, drift coefficient, and continuous movement cost equal respectively $A$, $B(2a - 1)$, and $Cs + D(1 - a)$, where $A$, $B$, $C$, and $D$ are arbitrary positive constants. Assume the boundary conditions are equivalent to pure reflection. In view of the boundary conditions, the solution of (5) for the minimal expected discounted cost $v(s)$ must be such that, in some neighborhoods of the boundaries, $v'(s) < D/2B$ and the optimal control $a(s)$ equals one. If $a(s) = 1$ for all $s \in S$ then


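$$v(s) = L_1 e^{\ell_1 s} + L_2 e^{\ell_2 s} + \frac{Cs}{\lambda} + \frac{BC}{\lambda^2},$$

where

$$L_1 = \frac{C}{\lambda \ell_1} \cdot \frac{1 - e^{\ell_2}}{e^{\ell_2} - e^{\ell_1}}, \qquad L_2 = \frac{C}{\lambda \ell_2} \cdot \frac{e^{\ell_1} - 1}{e^{\ell_2} - e^{\ell_1}},$$

$$\ell_1 = \frac{-A^{-1}B + \sqrt{A^{-2}B^2 + 4A^{-1}\lambda}}{2}, \qquad \text{and} \qquad \ell_2 = \frac{-A^{-1}B - \sqrt{A^{-2}B^2 + 4A^{-1}\lambda}}{2}.$$

The derivative $v'(s)$ is then a concave function that equals zero at $s = 0$ and $s = 1$ and assumes its maximum value at

$$s^* = \frac{1}{\ell_1 - \ell_2} \ln\left( \frac{-\ell_2^2 L_2}{\ell_1^2 L_1} \right).$$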


If $v'(s^*) \le D/2B$ at this maximizing point $s^*$, then $v(s)$ equals the minimal expected discounted cost and the optimal control is to process none of the wastes at any level of pollution, that is, pollute as much as possible. Suppose, on the other hand, that $v'(s^*) > D/2B$, that is,

$$\ell_1 L_1 e^{\ell_1 s^*} + \ell_2 L_2 e^{\ell_2 s^*} + \frac{C}{\lambda} > \frac{D}{2B}.$$

If $a(s) = 1$ for all $s \in [s_0,s_1]$, where $0 < s_0 < s_1 < 1$ and $v'(s_0) = v'(s_1) = D/2B$, then the derivative $v'(s)$ is concave on $[s_0,s_1]$, in which case $a(s) = 1$ is not optimal. Hence if $v'(s^*) > D/2B$ there exist two numbers $0 < s_0 < s_1 < 1$ such that if $s \in [0,s_0) \cup (s_1,1]$ then the optimal control is to pollute as much as possible, while if $s \in (s_0,s_1)$ then the optimal control is to process all of the waste products.
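The test described in this example is easy to carry out numerically. The sketch below assumes the constants $L_1$, $L_2$ are those fixed by $v'(0) = v'(1) = 0$ under the control $a(s) = 1$, computes the point where $v'$ is largest, and compares that largest slope with $D/2B$. The function name `check_always_pollute`, the result type, and all parameter values are hypothetical.

```python
import math
from typing import NamedTuple


class PollutionCheck(NamedTuple):
    s_star: float         # point where v' attains its maximum
    max_slope: float      # v'(s_star) under the control a(s) = 1
    threshold: float      # D / (2B)
    always_pollute: bool  # True when a(s) = 1 passes the optimality test


def check_always_pollute(A: float, B: float, C: float, D: float, lam: float) -> PollutionCheck:
    """Evaluate v' under a(s) = 1 and compare its maximum with D / (2B)."""
    disc = math.sqrt((B / A) ** 2 + 4.0 * lam / A)
    l1 = (-B / A + disc) / 2.0
    l2 = (-B / A - disc) / 2.0
    denom = math.exp(l2) - math.exp(l1)
    L1 = C * (1.0 - math.exp(l2)) / (lam * l1 * denom)
    L2 = C * (math.exp(l1) - 1.0) / (lam * l2 * denom)

    def v_d1(s: float) -> float:
        return l1 * L1 * math.exp(l1 * s) + l2 * L2 * math.exp(l2 * s) + C / lam

    # Pure reflection at both ends requires v'(0) = v'(1) = 0.
    assert abs(v_d1(0.0)) < 1e-9 and abs(v_d1(1.0)) < 1e-9

    s_star = math.log(-(l2**2) * L2 / (l1**2 * L1)) / (l1 - l2)
    slope = v_d1(s_star)
    threshold = D / (2.0 * B)
    return PollutionCheck(s_star, slope, threshold, slope <= threshold)


# Two hypothetical processing costs: with small D the threshold is exceeded,
# with larger D it is not.
cheap_processing = check_always_pollute(A=1.0, B=1.0, C=1.0, D=0.1, lam=1.0)
costly_processing = check_always_pollute(A=1.0, B=1.0, C=1.0, D=0.5, lam=1.0)
```

When the test fails, the switching points $s_0$ and $s_1$ are not given by this closed form, since $v$ must then be recomputed under the two-regime control; the sketch only decides which of the two cases applies.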
7. Application: Control of a Queueing System


Suppose a queueing system is characterized by having a finite waiting room, that is, the length of the queue is less than or equal to some number. In addition, assume that the length of the waiting line can be controlled by the servers, such as by changing the service rate. Then let the number of customers in this queueing system be represented by a controlled diffusion process fluctuating between zero and the capacity of the queueing system. The behavior of the process at these boundaries can be either reflection or reflection combined with adhesion. The control action corresponds to the service mode, for example, the service rate. If there are costs associated with alternative queue lengths and control actions, then a control which yields the minimal expected cost of the diffusion model will be an optimal control of the queueing system. Presumably, the queue length cost will be an increasing function of the queue length, and controls with a greater tendency to shorten the queue length will be more expensive.

An obvious shortcoming of this model is the fact that the length of a queue is a discrete state process whereas the diffusion process is continuous. In cases where this continuous state approximation is not sufficiently accurate, however, it may be possible to construct a diffusion process so that a discrete process which can be extracted from it will have certain desired properties. This discrete process can be defined as follows. For some positive integer $N$, let the diffusion process be defined on $[0, N]$ and let the discrete process have $N + 1$ states corresponding respectively to the integers $0, 1, \ldots, N$. Then the discrete process will occupy state $i$ if $i$ was the most recent integer value attained by the continuous process sample path. The discrete process will then enter state $i + 1$ (or $i - 1$) at the epoch when the continuous process first attains the value $i + 1$ ($i - 1$).
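
The extraction rule just described can be illustrated with a short sketch. It assumes, purely for illustration, that the continuous sample path is available as a finely sampled list of values that starts at an integer; the function name `extract_discrete_states` and the sampled-path representation are hypothetical and not part of the original construction.

```python
def extract_discrete_states(path):
    """Return the successive states of the extracted discrete process.

    `path` is a hypothetical sampled trajectory of the continuous process,
    assumed to start at an integer value.
    """
    state = int(round(path[0]))
    states = [state]
    for x in path[1:]:
        # The discrete process moves only when a neighbouring integer is attained.
        while x >= state + 1:
            state += 1
            states.append(state)
        while x <= state - 1:
            state -= 1
            states.append(state)
    return states
```

With a sampled path, a crossing is detected only at the first sample on or beyond the neighbouring integer, so the entry epochs are approximate; the sequence of visited states is what the sketch is meant to show.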

There is no assurance that the diffusion process can be constructed so that the first passage times of the extracted discrete process will have some specific probability distribution. In particular, exponentially distributed first passage times cannot generally be obtained. However, the following discussion will show that an arbitrary set of mean first passage times can be represented by the discrete process extracted from a Brownian motion with properly chosen boundary conditions.

Suppose that the queueing system has capacity $N$ and that four items of data are specified: the transition probability $p$ from state $i$ to $i + 1$ and the mean occupation time $\delta$ in state $i$, for $i = 1, 2, \ldots, N-1$; the mean first passage time $t$ from state 0 to 1; and the mean first passage time $u$ from state $N$ to $N - 1$. We want to calculate the diffusion coefficient $d$, the drift coefficient $b$, and the boundary conditions (we let $\theta_0 = \theta_1 = \kappa_0 = \kappa_1 = 0$) so that the extracted discrete process will correspond to this queueing system in the specified manner.

Utilizing the fact that the expected change in the state of the process upon exit from the interval $(i - 1, i + 1)$, given initial state $i$, equals the product of the drift coefficient and the mean occupation time $\delta$, and that this expected change is $2p - 1$, we conclude that $b = (2p - 1)/\delta$. Utilizing standard diffusion process theory (see, for example, Mandl [16, pp. 100-102]) to calculate the mean first passage time in terms of the drift and diffusion coefficients, we conclude for $b \ne 0$ that the diffusion coefficient $d$ must be the unique solution to $b\delta = \coth(b/d) - \operatorname{csch}(b/d)$. Note that a unique solution for $d$ always exists because $\coth(x) - \operatorname{csch}(x)$ increases monotonically from $-1$ to $1$ as $x$ increases from $-\infty$ to $\infty$. If we assume that $\delta < t$, $\pi_0 = 1$, and $b \ne 0$, then by standard diffusion process theory again the mean first passage time from state 0 to state 1 equals $\frac{1}{b} + \frac{d}{b}\left(\sigma_0 - \frac{1}{b}\right)\left(1 - e^{-b/d}\right)$. Setting this equal to $t$ allows one to calculate the coefficient $\sigma_0$ describing adhesion at boundary $r_0$. Similarly, if $b = 0$ then $d = 1/(2\delta)$ and $\sigma_0 = t - \delta$. The calculation of $\pi_1$ and $\sigma_1$ proceeds similarly.
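
As a hedged numerical illustration of the interior calculation, the sketch below recovers $b$ and $d$ from a transition probability $p$ and a mean occupation time $\delta$. It assumes the operator $d\,v'' + b\,v'$ with constant coefficients on $(i-1, i+1)$, for which the mean exit time from the midpoint is $[\coth(b/d) - \operatorname{csch}(b/d)]/b$, and it uses the identity $\coth x - \operatorname{csch} x = \tanh(x/2)$ to solve for $d$ in closed form. The function names are hypothetical.

```python
import math

def fit_interior(p, delta):
    """Return (b, d) matching up-probability p and mean occupation time delta."""
    drift_times_delta = 2.0 * p - 1.0          # expected displacement on exit
    b = drift_times_delta / delta
    if abs(drift_times_delta) < 1e-12:
        return 0.0, 1.0 / (2.0 * delta)        # driftless case: d = 1/(2*delta)
    # b*delta = coth(b/d) - csch(b/d) = tanh(b/(2d))  =>  d = b / (2*atanh(b*delta))
    d = b / (2.0 * math.atanh(drift_times_delta))
    return b, d

def mean_exit_time(b, d):
    """Mean exit time from (i-1, i+1) started at i, for constant b and d."""
    if b == 0.0:
        return 1.0 / (2.0 * d)
    x = b / d
    return (1.0 / math.tanh(x) - 1.0 / math.sinh(x)) / b
```

For example, `fit_interior(0.6, 2.0)` gives a drift of about 0.1, and feeding the resulting pair back into `mean_exit_time` reproduces the occupation time 2.0, which is the consistency check the construction relies on.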

Example. This example of controlling a queue uses the discounted cost criterion for evaluating optimality. We have $S = [0, N]$ and $A_s = [0, R]$. The diffusion coefficient equals the positive constant $A$, the drift coefficient equals $-a$, and the continuous movement cost equals $Cs + Da$, where $C$ and $D$ are positive constants. Thus larger control values will tend to shorten the queue length at the expense of a greater control cost. The boundary conditions are equivalent to pure reflection.

The calculations for this example proceed similarly to those for the example in Section 6. In some neighborhoods of the boundaries the optimal control $a(s)$ must be zero. If $a(s) = 0$ for all $s \in S$ then the expected discounted cost is

$$v(s) = \frac{C}{\lambda}\left[ s + \frac{(e^{\ell N} - 1)e^{-\ell s} - (1 - e^{-\ell N})e^{\ell s}}{\ell\,(e^{\ell N} - e^{-\ell N})} \right],$$

where $\ell = \sqrt{\lambda/A}$. The derivative $v'(s)$ attains its maximum value at $s = N/2$, so if

$$v'(N/2) = -\frac{2C}{\lambda}\left(\frac{e^{\ell N/2} - e^{-\ell N/2}}{e^{\ell N} - e^{-\ell N}}\right) + \frac{C}{\lambda} \le D,$$

then $a(s) = 0$ is optimal and $v(s)$ is the minimal expected discounted cost. On the other hand, if $v'(N/2) > D$, then there exist $0 < s_0 < s_1 < N$ such that if $s \in [0, s_0) \cup (s_1, N]$ then $a(s) = 0$ is optimal, whereas if $s \in (s_0, s_1)$ then $a(s) = R$ is optimal.
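
The threshold test in this example can be checked numerically. The sketch below is illustrative only: it assumes that under the control $a(s) = 0$ the cost $v$ solves $A v'' - \lambda v + Cs = 0$ on $[0, N]$ with $v'(0) = v'(N) = 0$ (pure reflection), where $\lambda$ is the discount rate, and evaluates the resulting derivative. The function names are hypothetical.

```python
import math

def v_prime_no_service(s, N, A, lam, C):
    """Derivative of the discounted cost under a(s) = 0 (assumed reflecting ends)."""
    ell = math.sqrt(lam / A)
    num = (math.exp(ell * N) - 1.0) * math.exp(-ell * s) \
        + (1.0 - math.exp(-ell * N)) * math.exp(ell * s)
    den = math.exp(ell * N) - math.exp(-ell * N)
    return (C / lam) * (1.0 - num / den)

def zero_control_is_optimal(N, A, lam, C, D):
    """True when the largest slope, attained at s = N/2, does not exceed D."""
    return v_prime_no_service(N / 2.0, N, A, lam, C) <= D
```

The comparison with $D$ reflects the fact that the control enters the cost rate through the term $a\,(D - v'(s))$, so a positive service rate pays off only where $v'(s) > D$.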


8. Application: Making Optimal Investments

Suppose the owner of an investment fund has available to him a number of alternative investment opportunities, each of which is characterized by a rate of return and a value of risk that are constant with respect to time. Moreover, suppose that the value of the investment fund is characterized by being bounded by two numbers. For example, the value might always be non-negative and if the fund's owner ever acquires a million dollars, then he would stop investing. If the owner wants to make the optimal choice of investments for every level of the fund's value, then his problem can perhaps be solved by the consideration of an appropriate controlled diffusion process.

Let the value of the investment fund correspond to the state of the diffusion process and assume that the behavior of the fund at the boundaries can be represented by some choice of diffusion process boundary conditions. For example, the fund value could behave like reflection at the lower boundary and absorption at the upper one. The map $A_s$ describing admissible actions is formulated so that, for each $s \in S$, there is a one-to-one correspondence between admissible actions $a \in A_s$ and the investment opportunities which are available when the fund's value is $s$. Let the continuous movement cost reflect the utility to the fund's owner of the fund being at a particular level for one unit of time, and let the costs associated with the boundary conditions be defined in a corresponding manner. Then the control which yields the minimal expected cost for the controlled diffusion process will correspond to the optimal investment policy.

The value of risk associated with an investment is generally specified by the variance per dollar invested. Thus it is convenient to describe each investment opportunity by a pair $(a_1, a_2)$ where $a_1 > 0$ is the variance per dollar invested and $a_2 \in E$ is the yield, that is, rate of return. Then we can let $A_s$ be a compact subset of $E^+ \times E$ so that each admissible action $(a_1, a_2) \in A_s$ corresponds to some investment opportunity. Normally, the map $A_s$ is constant with respect to $s \in S$, but not necessarily so. Certain investment opportunities, for example, might be available only to funds of some minimum size.

In formulating this controlled diffusion process investment model, it remains to specify the drift and diffusion coefficients. Given a specific investment opportunity, the expected profit and standard deviation per unit time for a fund will be proportional to the fund's value. Consequently, if the fund is invested in opportunity $(a_1, a_2) \in A_s$ then the appropriate coefficients for the diffusion process model are $d(s,a) = s^2 a_1$ and $b(s,a) = s a_2$.

Example. For this optimal investment example we assume the fund's value is bounded as $0 < r_0 \le s \le r_1$ and we want to minimize the probability of reaching $r_0$ before $r_1$. This problem can be solved by considering a non-conservative diffusion process with undiscounted costs.

Suppose each boundary condition is pure absorption, with cost 1 at $r_0$ and cost 0 at $r_1$. Then with a zero continuous movement cost, the minimal expected cost $v(s)$ is the minimum probability of reaching $r_0$ before $r_1$ when starting at $s$. Clearly $v'(s) < 0$, so by (Id), we want to choose $a \in A_s$ so that

$$\frac{b(s,a)}{d(s,a)} = \max_{a' \in A_s} \frac{b(s,a')}{d(s,a')}.$$

In particular, suppose $A_s$, $b(s,a)$ and $d(s,a)$ are as formulated above with $A_s$ constant with respect to $s \in S$. If $a_2 > 0$ for some $(a_1, a_2) \in A_s$, then $\max_{a \in A_s} (s a_2 / s^2 a_1) > 0$ and this corresponds to a favorable game. Note that if several investments have the same positive yield then the least risky one will maximize $(s a_2 / s^2 a_1)$, so conservative play is optimal. On the other hand, suppose $a_2 \le 0$ for all $(a_1, a_2) \in A_s$; this corresponds to an unfavorable game, e.g., a casino. In this case, if several investments have the same negative yield, then the riskiest one maximizes $(s a_2 / s^2 a_1)$, that is, bold play is optimal.
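
The selection rule of this example can be sketched in a few lines. The code below assumes, as an illustration only, a finite list of investment opportunities $(a_1, a_2)$ and the coefficients $d(s,a) = s^2 a_1$, $b(s,a) = s a_2$; the function name `best_investment` is hypothetical.

```python
def best_investment(s, opportunities):
    """Pick the (a1, a2) pair maximizing b(s, a) / d(s, a) = s*a2 / (s**2 * a1).

    a1 > 0 is the variance per dollar invested and a2 is the yield.
    """
    return max(opportunities, key=lambda a: (s * a[1]) / (s * s * a[0]))
```

With equal positive yields the rule selects the smaller variance (conservative play), and with equal negative yields it selects the larger variance (bold play).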




CHAPTER  III 


MULTI-PERSON  CONTROLLED  DIFFUSIONS 

This chapter generalizes the concept of a controlled one-dimensional diffusion process by allowing the process to be controlled by $N$ persons. If the process is controlled by two persons with opposite objectives, then the problem of optimally controlling this process may be viewed as a zero sum, two-person game. On the other hand, if the process is controlled by $N \ge 2$ persons with possibly different objectives, then the problem of optimally controlling this process may be viewed as a non-zero sum, N-person game.

The results in this chapter are intimately connected with those for single person controlled diffusions (see Chapter II and Mandl [16]). In addition, minimax problems in the theory of diffusions have been treated by Girsanov [10]. The multi-person controlled diffusion process is formulated in the following section; the zero sum, two-person game problem is discussed in the succeeding four sections; and the non-zero sum, N-person game problem is treated in the final five sections. Both discounted and undiscounted costs are considered for both game problems, and existence theorems are provided. In addition, several possible applications of multi-person controlled diffusions are given. A major result of this chapter is that the value of a zero sum, two-person game is the unique solution of a differential equation.


1. The Multi-Person Controlled Diffusion Process


The multi-person controlled diffusion process is formulated as in Chapter II, only taking into account the multiple number of controllers. Consider a diffusion process with state space $S$, a compact interval of the real line $E^1$, which is controlled by $N$ persons (integer $N \ge 2$). For each $i = 1, 2, \ldots, N$, some positive integer $n_i$, and some compact set $K_i \subset E^{n_i}$, the $i$th person's control is a vector-valued function on $S$ with range $K_i$. Let $A_s^i$ be a point-to-set map from $S$ into $K_i$ such that $A_s^i$ is piecewise continuous in $s$ in the Hausdorff metric and for each $s \in S$ the set $A_s^i$ is a non-empty compact subset of $K_i$. Each time the process is observed in state $s$ the $i$th person chooses an action from the set $A_s^i$. The set $M_i$ of admissible controls for the $i$th person consists of all piecewise continuous functions $a_i(s)$ on $S$ with range in $K_i$ such that the action $a_i(s) \in A_s^i$ for each $s \in S$.

Let $M = M_1 \times M_2 \times \cdots \times M_N$, $K = K_1 \times \cdots \times K_N$, $a(s) = (a_1(s), \ldots, a_N(s))$, and $A_s = (A_s^1, \ldots, A_s^N)$, so that $M$ is the set of admissible controls, a function $a(s)$ is an admissible control if and only if $a(\cdot) \in M$, and $a(\cdot) \in M$ implies $a(s) \in A_s$ for each $s \in S$. Throughout this chapter it should be clear from the context whether the letter $a$ denotes an admissible control $a = a(\cdot) \in M$ or an admissible action $a \in A_s$ for some $s \in S$. The map $A_s$ is characterized in Chapter II. We assume $M \ne \emptyset$ hereafter without further mention.


The definition of a multi-person controlled diffusion process is a slight generalization of Mandl's [16, p. 157] controlled diffusion process. Let $d(s,a)$ be a continuous, positive real-valued function on $S \times K$. Then for $a(\cdot) \in M$ the piecewise continuous function $d(s,a(s))$ is the diffusion coefficient of the process. Similarly, let $b(s,a)$ be a continuous real-valued function on $S \times K$, so that $b(s,a(s))$ is the drift coefficient of the diffusion process.

Following Mandl, with a given control $a(\cdot) \in M$ the multi-person controlled diffusion process is completely specified by the generalized classical differential operator

$$D = d(s,a(s))\frac{d^2}{ds^2} + b(s,a(s))\frac{d}{ds}$$

together with Feller's [7,9] boundary condition

$$\kappa_j v(r_j) + \theta_j\left[v(r_j) - \int_S v(s)\,d\mu_j(s)\right] - (-1)^j \pi_j v'(r_j) + \sigma_j (Dv)(r_j) = 0, \qquad j = 0, 1,$$

where $v(s)$ is some function whose second derivative is piecewise continuous on $S$. At each boundary $r_0, r_1$ the four non-negative parameters $\kappa_j$, $\sigma_j$, $\pi_j$ and $\theta_j$, at least one of which must be positive, correspond respectively to the phenomena of absorption, adhesion, reflection and instantaneous return. Corresponding to $\theta_j$ is the probability distribution function $\mu_j(s)$, where $\int_S d\mu_j(s) = 1$. This boundary condition is interpreted more fully in Chapter II.

The multi-person controlled diffusion process generates costs according to its sample path and control (Mandl [16, p. 148]). With the zero sum, two-person game problem, exactly one stream of costs is generated, as is the case with the single person controlled diffusion (Chapter II). But with the non-zero sum, N-person game problem, exactly $N$ streams of costs will be generated, with one stream corresponding to each controller. The costs of a multi-person controlled diffusion will be formulated below for the N-person problem, but it should be borne in mind that the formulation for the zero sum problem is exactly the same, except that the subscript $i$ relating the cost streams with the controllers will be dropped.

Each cost stream is comprised of the same three types of costs that were specified in Mandl [16] and Chapter II. The continuous movement cost for the $i$th person is defined by the bounded, continuous real-valued function $c_i(s,a)$ on $S \times K_1 \times \cdots \times K_N$; let $c(s,a)$ denote the N-component vector of these functions. The cost for the $i$th person due to instantaneous returns from boundary $r_j$ is expressed by the real-valued function $\nu_{ij}(s)$ on $S$, which is integrable with respect to $\mu_j(s)$; let $\nu_j(s)$ denote the vector of these functions.

Finally, the $i$th person incurs a cost due to the termination (absorption) of the process at boundary $r_j$; for each boundary these costs are likewise collected in an N-component vector.


.  th 


If $C_i(t)$ is the total of the $i$th person's costs generated by the process up through time $t$, and $C(t) = (C_1(t), \ldots, C_N(t))$ is the vector of these costs, then the N-component vector

$$v(s) = E_s \int_0^\infty e^{-\lambda t}\,dC(t)$$

denotes the conditional expectation of the total discounted costs of the process given an initial state $s$, a control $a(\cdot) \in M$, and a discount factor $e^{-\lambda}$, $\lambda > 0$. From Mandl [16, p. 119] we have the following result.

Theorem 1. The vector of expected discounted costs corresponding to $a(\cdot) \in M$ is the unique function $v(s)$ on $S$ such that $v'(s)$ is continuous,

(1) $d(s,a(s))v''(s) + b(s,a(s))v'(s) - \lambda v(s) + c(s,a(s)) = 0$

holds for every $s \in (r_0, r_1)$ which is a continuity point of $a(s)$, and

(2)  (6j  +  <j)v(rj)  -  Oj  f(v(s)  +  ..j(s))dM.(s)  - 

s’ 

+  a  (Av(r.)  -  c(r  .a(r  )))  -  <  X  -  0  ,  j  -  0,1  , 

J  J  J  J  J  J 

If  the  process  is  non-conservative  and  neither  boundary  is  purely 
adhesive,  that  is 


1 


0  1, 


then by Mandl [16, p. 152] the vector $v(s) = E_s C(\infty)$ of the expected total undiscounted costs is finite and is the unique solution of (1) and (2) for $\lambda = 0$. If the process is conservative ($\kappa_0 = \kappa_1 = 0$), then the total undiscounted costs may be infinite. The vector $\Theta = (\Theta_1, \ldots, \Theta_N)$ in the following theorem, which is an immediate consequence of Mandl [16, pp. 152-157, 168], can be interpreted as the vector of mean costs per unit time.


Theorem 2. Let $\kappa_0 = \kappa_1 = 0$ and assume at least one boundary is not purely adhesive, that is, $\pi_0 + \theta_0 + \pi_1 + \theta_1 > 0$. If $v(s,\lambda)$ is the vector of expected discounted costs corresponding to $\lambda > 0$ and some $a(\cdot) \in M$, then

$$\lim_{\lambda \to 0} \lambda v(s,\lambda) = \Theta \quad \text{and} \quad \lim_{\lambda \to 0} \frac{d}{ds} v(s,\lambda) = w(s),$$

where $\Theta$ is some vector independent of the state $s$, and $w(s)$ is some absolutely continuous vector-valued function on $S$. Moreover,

$$P\left(\lim_{t \to \infty} t^{-1} C(t) = \Theta\right) = 1,$$

and $(\Theta, w)$ is the unique pair satisfying

(3) $d(s,a(s))w'(s) + b(s,a(s))w(s) - \Theta + c(s,a(s)) = 0$

for every $s \in (r_0, r_1)$ which is a continuity point of $a(s)$, and

(4) $\theta_j \int_S \left[\int_{r_j}^{s} w(y)\,dy + \nu_j(s)\right] d\mu_j(s) + (-1)^j \pi_j w(r_j) + \sigma_j\left(c(r_j, a(r_j)) - \Theta\right) = 0, \quad j = 0, 1.$

2. The Zero Sum, Two-Person Game Problem

In this and the following three sections we consider diffusion processes which are controlled by two persons ($N = 2$), but which generate single streams of costs. The persons have opposite objectives; the first wants to minimize the costs while the other wants to maximize the costs. Note that this zero sum game can be regarded as a special case of the non-zero sum game problem for $N = 2$ by letting the second player's costs equal the negative of the first player's.

We shall consider a single stream of costs and therefore omit the subscript on all cost symbols. For any particular problem, player 1, who operates the first control component, endeavors to choose a control $a_1(\cdot) \in M_1$ so as to minimize the expected costs generated by the process. Player 2, who operates the second control component, endeavors to choose a control $a_2(\cdot) \in M_2$ so as to maximize the costs generated by the process. By a solution to this game is meant some admissible control which is a saddlepoint of the expected cost function. Thus, if player 1 unilaterally deviates from this optimal control, then the expected costs cannot be decreased but they may increase. Similarly, player 2 can unilaterally only decrease the expected costs.

The following two sections provide results respectively for the discounted cost case and the undiscounted cost case. The method for solving a problem is basically the same in each case. A differential equation is solved and the solution is used to determine the saddlepoint of a function with respect to all admissible controls. If this saddlepoint exists, then it is used to obtain an optimal control, that is, a solution to the zero sum, two-person game. Section 5 indicates a possible application of this model to optimal welfare policies.


3. The Zero Sum Problem with Discounted Costs

Let $v(s,a_1,a_2) = v(s)$ denote the expected discounted cost of a process corresponding to the admissible control $a = (a_1, a_2) \in M$. Then $v(s,a_1,a_2)$ will be the unique solution of (1) and (2). The control $\hat a = (\hat a_1, \hat a_2) \in M$ is said to be optimal if for all $a_1 \in M_1$, all $a_2 \in M_2$, and all $s \in S$ we have

$$v(s, \hat a_1, a_2) \le v(s, \hat a_1, \hat a_2) \le v(s, a_1, \hat a_2),$$

in which case $v(s, \hat a_1, \hat a_2)$ is said to be the value of the game. We shall later prove that the value of a game, if it exists, is provided by the following.

Theorem 3. There exists a unique solution $v(s)$ to

(5) $v''(s) + \min_{a_1 \in A_s^1} \max_{a_2 \in A_s^2} \left\{ d(s,a_1,a_2)^{-1} \left[ b(s,a_1,a_2)v'(s) - \lambda v(s) + c(s,a_1,a_2) \right] \right\} = 0$

satisfying


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(6)  (0j  +  Kj)v(r^)  -  6j  J(v(b)  f  <s))duj  (a)  -  (-l)^n  v'  (r^ ) 


+  o^(Av(rj)  -  Vj)  -  <j>.  -  0  ,  j  -  0,1, 


where

$$\gamma_j = \min_{a_1 \in A_{r_j}^1} \max_{a_2 \in A_{r_j}^2} c(r_j, a_1, a_2), \qquad j = 0, 1.$$


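Before proving this theorem, some notation will be introduced and a number of preliminary lemmas will be proved.

Define:

$$\alpha(s,a_1,a_2) = d(s,a_1,a_2)^{-1},$$
$$\beta(s,a_1,a_2) = b(s,a_1,a_2)\,d(s,a_1,a_2)^{-1},$$
$$\gamma(s,a_1,a_2) = c(s,a_1,a_2)\,d(s,a_1,a_2)^{-1},$$
$$g_1(s,v_1,v_2) = v_2,$$
$$g_2(s,v_1,v_2) = -\min_{a_1 \in A_s^1} \max_{a_2 \in A_s^2} \left\{ \beta(s,a_1,a_2)v_2 - \lambda\alpha(s,a_1,a_2)v_1 + \gamma(s,a_1,a_2) \right\},$$
$$g(s,v_1,v_2) = \begin{pmatrix} g_1(s,v_1,v_2) \\ g_2(s,v_1,v_2) \end{pmatrix}.$$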

We have the following result from Berge [2, pp. 115-116].

Lemma 4. Let $X$ and $Y$ be compact topological spaces. If $f$ is a lower (upper) semi-continuous numerical function on $X \times Y$ and $\Gamma$ is a lower (upper) semi-continuous mapping of $X$ into $Y$ such that, for each $x$, $\Gamma x \ne \emptyset$, then the numerical function $h$ defined by

$$h(x) = \sup\{ f(x,y) \mid y \in \Gamma x \}$$

is lower (upper) semi-continuous on $X$.

Lemma 5. If $A_s^1$ and $A_s^2$ are continuous at $\bar s$, and $(\bar v_1, \bar v_2)$ are arbitrary, then $g(s,v_1,v_2)$ is continuous in $(s,v_1,v_2)$ at $(\bar s, \bar v_1, \bar v_2)$.

Proof. For $i = 1, 2$ let $C_i \subset E$ denote a compact set containing an open neighborhood of $\bar v_i$. With the notation of Lemma 4, identify $X$ with $S \times K_1 \times C_1 \times C_2$, $Y$ with $K_2$, $f$ with $\beta(s,a_1,a_2)v_2 - \lambda\alpha(s,a_1,a_2)v_1 + \gamma(s,a_1,a_2)$, and $\Gamma$ with $A_s^2$. Conclude by Lemma 4 that the numerical function

$$\max_{a_2 \in A_s^2} \left\{ \beta(s,a_1,a_2)v_2 - \lambda\alpha(s,a_1,a_2)v_1 + \gamma(s,a_1,a_2) \right\}$$

is continuous at $(\bar s, a_1, \bar v_1, \bar v_2)$ for all $a_1 \in K_1$. Repeating this reasoning in a similar manner, conclude that $g_2(s,v_1,v_2)$, and hence, trivially, $g(s,v_1,v_2)$, are continuous in $(s,v_1,v_2)$ at $(\bar s, \bar v_1, \bar v_2)$.

In the following lemma, we use the norm

$$\|g(s,v_1,v_2)\| = \max\left\{ \sup_{s \in S} |g_1(s,v_1,v_2)|,\ \sup_{s \in S} |g_2(s,v_1,v_2)| \right\}.$$

Lemma 6. The function $g(s,v_1,v_2)$ is Lipschitzian with respect to $(v_1,v_2)$, that is, for some positive constant $L$ not depending on $s$, $v_1$, or $v_2$,

$$\|g(s,v_1,v_2) - g(s,\bar v_1,\bar v_2)\| \le L\,\|(v_1,v_2) - (\bar v_1,\bar v_2)\|$$

for every $s \in S$ and every pair $(v_1,v_2)$, $(\bar v_1,\bar v_2)$.

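Proof. Let $s$, $(v_1,v_2)$, and $(\bar v_1,\bar v_2)$ be arbitrary and without loss of generality assume $g_2(s,v_1,v_2) \ge g_2(s,\bar v_1,\bar v_2)$. Suppose $\hat a_1 \in A_s^1$ is such that

$$g_2(s,v_1,v_2) = -\max_{a_2 \in A_s^2} \left\{ \beta(s,\hat a_1,a_2)v_2 - \lambda\alpha(s,\hat a_1,a_2)v_1 + \gamma(s,\hat a_1,a_2) \right\},$$

and let $\hat a_2 \in A_s^2$ attain the maximum of $\beta(s,\hat a_1,a_2)\bar v_2 - \lambda\alpha(s,\hat a_1,a_2)\bar v_1 + \gamma(s,\hat a_1,a_2)$ over $a_2 \in A_s^2$. Then

$$g_2(s,v_1,v_2) - g_2(s,\bar v_1,\bar v_2)$$
$$\le \max_{a_2 \in A_s^2} \left\{ \beta(s,\hat a_1,a_2)\bar v_2 - \lambda\alpha(s,\hat a_1,a_2)\bar v_1 + \gamma(s,\hat a_1,a_2) \right\} - \max_{a_2 \in A_s^2} \left\{ \beta(s,\hat a_1,a_2)v_2 - \lambda\alpha(s,\hat a_1,a_2)v_1 + \gamma(s,\hat a_1,a_2) \right\}$$
$$\le \beta(s,\hat a_1,\hat a_2)\bar v_2 - \lambda\alpha(s,\hat a_1,\hat a_2)\bar v_1 + \gamma(s,\hat a_1,\hat a_2) - \left\{ \beta(s,\hat a_1,\hat a_2)v_2 - \lambda\alpha(s,\hat a_1,\hat a_2)v_1 + \gamma(s,\hat a_1,\hat a_2) \right\}$$
$$\le C_1 |v_2 - \bar v_2| + \lambda C_2 |v_1 - \bar v_1|,$$

where

$$C_1 = \max_{s \in S,\ a_1 \in K_1,\ a_2 \in K_2} |\beta(s,a_1,a_2)| \quad \text{and} \quad C_2 = \max_{s \in S,\ a_1 \in K_1,\ a_2 \in K_2} |\alpha(s,a_1,a_2)|.$$

Since $|g_1(s,v_1,v_2) - g_1(s,\bar v_1,\bar v_2)| = |v_2 - \bar v_2|$, the desired result follows with $L = \max\{1, C_1 + \lambda C_2\}$.
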


In subsequent lemmas we use $v(s,u_1,u_2)$ to denote the solution of (5) on $S$ satisfying $v(r_0,u_1,u_2) = u_1$ and $v'(r_0,u_1,u_2) = u_2$ where, of course, $v'(s,u_1,u_2) = \frac{\partial}{\partial s} v(s,u_1,u_2)$. This is not to be confused with the notation at the beginning of this section. It should be clear from the context whether the second and third arguments of $v(s,u_1,u_2)$ are boundary conditions or admissible controls.

Lemma 7. For $u_1, u_2 \in (-\infty, \infty)$ equation (5) has a unique solution $v(s,u_1,u_2)$ on $S$ satisfying $v(r_0,u_1,u_2) = u_1$ and $v'(r_0,u_1,u_2) = u_2$.

Proof. It suffices to show the equation

$$\frac{d}{ds} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = g(s,v_1,v_2)$$

has a unique solution on $S$ satisfying $v_1(r_0) = u_1$ and $v_2(r_0) = u_2$ because then $v(s,u_1,u_2) = v_1(s)$. By differential equation theory and the piecewise continuity of $A_s$, Lemmas 5, 6 and a result in Edwards [6, pp. 153-155] imply the result.


Lemma 8. The functions $v(s,u_1,u_2)$ and $v'(s,u_1,u_2)$ are continuous, strictly increasing functions of $u_2$ with limits $\pm\infty$ as $u_2 \to \pm\infty$ for each fixed $s \in (r_0, r_1]$ and each fixed $u_1 \in (-\infty, \infty)$.

Proof. We first show the function $v'(s,u_1,\cdot)$ is strictly increasing. Suppose not, so that for some $u_1 \in (-\infty,\infty)$, $s_0 \in (r_0, r_1]$, and pair $u_2 < \bar u_2$, say, we have $v'(s_0,u_1,u_2) = v'(s_0,u_1,\bar u_2)$ and $v'(s,u_1,u_2) < v'(s,u_1,\bar u_2)$ for all $s \in [r_0, s_0)$. It follows that $v''(s,u_1,u_2) > v''(s,u_1,\bar u_2)$ for some $s < s_0$ in every neighborhood of $s_0$ and $v(s_0,u_1,u_2) < v(s_0,u_1,\bar u_2)$. But since $\alpha(s,a_1,a_2) > 0$ and by continuity we have $v''(s,u_1,u_2) \le v''(s,u_1,\bar u_2)$ for all $s < s_0$ in some neighborhood of $s_0$, a contradiction. Thus $u_2 < \bar u_2$ must imply $v'(s,u_1,u_2) < v'(s,u_1,\bar u_2)$, in which case $v(s,u_1,u_2) < v(s,u_1,\bar u_2)$, for each $s \in (r_0, r_1]$. The continuity of $v(s,u_1,u_2)$ and $v'(s,u_1,u_2)$ with respect to $u_2$ follows by standard differential equation theory.

To show the limiting behavior of $v(s,u_1,u_2)$ and its derivative, it suffices to consider $u_2 \to \infty$ and $u_1 \le 0$; the situation with $u_1 > 0$ reduces to this case and the proof with $u_2 \to -\infty$ is similar. For arbitrary $u_1 \le 0$, $s_0 \in (r_0, r_1]$, and $L \in (0,\infty)$, it suffices to show $v'(s_0,u_1,u_2) \ge L$ for some $u_2 \in (-\infty,\infty)$. To do this, we consider the differential equation


(7) 


y"(a)  -  -Cfty’(«)  +  X-'  y(s)  -  C 


where  C0  *  max  | 6(s,a, ,a,) j , 
B  seS  1  l 


al£Kl 

a2cK2 


C  »  max  .  (:j  ,a  ,a_)  |  , 
Y  seS 

*1,K1 

a2£K2 


and  C2  >  0  .  Now  if  y(s;  is  a  solution  of  (7)  with  y(rQ)  * 

u,  £  (-«.“■’)  and  4^(r„)  *  u_  c  then  it  is  easy  to  verify  that 

1  '  ds  U  l 

y  (s)  -  *  as  u2  -*  ®  for  each  s  e  S  .  In  particular:  with 

C*  •  C  min  a(s.a,  a.),  there  exists  some  constant  L.  ’  L  such  that 

2  "  sc S  X'  *  1  " 

“1£K1 


a2£K2 


if  p  c 


ro  +  so 


then  the  solution  to  (7)  satisfying  y(p)  ■  0 


and  y'(p)  _  will  be  such  that  y'  (s)  _  L  for  all  s  c  [ p , ® q )  - 

Also;  with  C-  ■  C  -  max  c*(s,a,  a,);  there  exists  some  constant  u2 
2  scS  A 

VK1 

a2cK2 
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such  chat  the  solution  to  (7)  satisfying  y(rn)  -  u,  and  y'(r  )  -  u 

(j  i  •'  O'  2 


will  be  such  that  y(S)  -  0  for  80me  "  e 


v  VLi 


and  y  ’  (a)  ^  L 


for  all  s  c  [r^,s];  let  y(s)  denote  this  solution  on  [r  ,s] 


We  now  claim  that  v*  <3^,.^)  L  L  because  vfs.u^u^  is 

bounded  below  by  appropriate  solutions  of  (7),  For  example,  suppose 

s  c  [r0,s]  is  such  that  0  »_  vfa.u^Uj)  L  y(s)  and  V  -  y'  (5) 

Then  for  some  (a. ,a„)  e  A- 
1  *  s 


v"(StUl,u2)  -  y"(2)  -  [Ce  -  e(s,ara2)]y’ (s) 

+  X[a(s,alta2)  -  C]v(8,u  ,u  ) 

+  AC[v(s,ullu2)  -  y(s) ]  +  [c^  -  Y(§,ai,a2)]  . 

Note  the  last  term  on  the  right  hand  side  is  positive  and  the  others  are 
non-negative  so  V’d^.u^  >  y"(s)  .  By  continuity,  v"(s;uru2)  >  y’  (S) 
for  all  s  in  some  neighborhood  of  5  .  In  particular,  if  we  let  §  .  r 
then  it  becomes  apparent  that  v'ts.u^)  *  y*(g)  i8  impossible  with 
v(8,u1,u2)«0  for  8  (r0,£j  ,  Thus  v'  (s.u^)  >  y’  (8)  for  each 

such  s  and  there  exists  some  p  t  (rQ,S)  such  that  v(p,u  u  )  »  0 
ond  v'  (p,Ul  u2)  >_  L  . 

Now  let  y(s)  be  the  solutit  n  to  (7)  with  ^(p)  =  0,  y_'  (p)  « 
v  ip,u1,u2)i  and  C2  =>  C  and  note  that  y' (s)  _  L  for  all  s  e  [p ,8  ] 

By  comparing  vCs.u^)  with  z(s)  as  we  did  with  y,s),  we  conclude 
the  desired  result. 
Lemma 9. For fixed $u_1$ the function

$$v(r_1,u_1,u_2) - \int_S v(s,u_1,u_2)\,d\mu_1(s)$$

is continuous and strictly increasing in $u_2$ and diverges to $\pm\infty$ as $u_2 \to \pm\infty$.

Proof. Let $u_2 < \bar u_2$ and $s \in [r_0, r_1]$ be arbitrary. By Lemma 8, $v'(s,u_1,u_2)$ is increasing in $u_2$, so $v(r_1,u_1,\bar u_2) - v(s,u_1,\bar u_2) \ge v(r_1,u_1,u_2) - v(s,u_1,u_2)$. The function in this lemma is just a convex combination of the right hand side of this inequality, so by this inequality this function is increasing in $u_2$.

Since $v'(s,u_1,u_2) \to \pm\infty$ as $u_2 \to \pm\infty$, we have for any $s \in (r_0, r_1)$ that $v(r_1,u_1,u_2) - v(s,u_1,u_2) \to \pm\infty$ as $u_2 \to \pm\infty$. Thus, by the convex combination argument the function in this lemma has the same limits. Continuity follows from the continuity of $v(s,u_1,u_2)$.

Lemma 10. Let $\beta(s)$, $\alpha(s)$ and $\gamma(s)$ be measurable real-valued functions on $S$ with $|\beta(s)| \le C_\beta$, $\alpha(s) \ge C_\alpha > 0$, and $\gamma(s) \ge 0$, and suppose for some $\bar s \in (r_0, r_1)$, $u_1 > 0$, and $u_2 \ge 0$, the function $v(s)$ is a solution to

$$v''(s) = \beta(s)v'(s) + \alpha(s)v(s) + \gamma(s)$$

satisfying $v(\bar s) = u_1$ and $v'(\bar s) = u_2$. Then for all $s \in (\bar s, r_1]$ we have $v(s) > 0$ and $v'(s) > 0$.

Proof. Suppose there is a smallest $s_0 > \bar s$ such that $v'(s_0) = 0$. Then $v(s_0) > 0$, and by continuity, $v''(s) > 0$ in some neighborhood of $s_0$. Hence $v'(s) < 0$ for all large enough $s < s_0$, contradicting $u_2 \ge 0$ and the definition of $s_0$.


Proof  of  Theorem  3..  Denote 


l 


Nj  “  6J  J  (s> (s>  +  OjVj  +  •,  j  ■>  0,1  , 


By Lemma 7 it suffices to show that $v(s,u_1,u_2)$, the unique solution of (5) with $v(r_0,u_1,u_2) = u_1$ and $v'(r_0,u_1,u_2) = u_2$, satisfies

(8) $(\lambda\sigma_j + \theta_j + \kappa_j)\,v(r_j,u_1,u_2) - \theta_j \int_S v(s,u_1,u_2)\,d\mu_j(s) - (-1)^j \pi_j v'(r_j,u_1,u_2) = N_j, \quad j = 0, 1,$

for unique values of $u_1$ and $u_2$. There are two cases.


Case  1:  6q  <*  ^  =  0 


By  (8).  u^  =  Nq/(Ao  +  <q)  .  By  Lemmas  8  and  9,  the  left  hand 
side  of  (8)  for  j  =  i  increases  continuously  and  strictly  from  -* 


to  »  as  u2  increases  from  -»  to  ».  Hence  (8)  for  j  °  1  is 


satisfied  by  a  unique  value  of  u2  » 


Case 2: $\theta_0 + \pi_0 > 0$.

For $P \in (-\infty,\infty)$ denote

\[ f_P(s,v_1,v_2) = -\min_{a_1 \in A^1_s} \max_{a_2 \in A^2_s} \{\beta(s,a_1,a_2)v_2 - \lambda\alpha(s,a_1,a_2)v_1 + P\gamma(s,a_1,a_2)\} \]

and let $v_P(s,u_1,u_2)$ denote the unique solution to

\[ v_P''(s,u_1,u_2) = f_P(s, v_P(s,u_1,u_2), v_P'(s,u_1,u_2)) \tag{9} \]

and

\[ v_P(r_0,u_1,u_2) = u_1, \qquad v_P'(r_0,u_1,u_2) = u_2. \tag{10} \]

Then by differential equation theory $v_P(s,u_1,u_2)$ and $v_P'(s,u_1,u_2)$
are continuous in $(P,s,u_1,u_2)$. We seek to show that $v_1(s,u_1,u_2)$
satisfies boundary condition (8) for a unique choice of the pair $u_1,u_2$.

We can rewrite (8) for $j = 0$ and general $P$ as

\[ \theta_0 \int_S v_P(s,u_1,u_2)\,d\mu_0(s) + \pi_0 u_2 = \phi_P(u_1) \tag{11} \]

where

\[ \phi_P(u_1) = (\lambda\sigma_0 + \theta_0 + \kappa_0)u_1 - P N_0. \tag{12} \]

We first show that for each $u_1 \in (-\infty,\infty)$ and
$P \in (-\infty,\infty)$ there exists a unique $u_2 = u_2(P,u_1)$ satisfying
(11). But this follows from Lemma 8, because then the left hand side of (11)
is continuous and strictly increasing in $u_2$ with limits $\pm\infty$ as
$u_2 \to \pm\infty$. Note that, since both sides of (11) are continuous in
$(P,u_1,u_2)$, the function $u_2(P,u_1)$ is continuous in $(P,u_1)$.

It remains to show that $v(s,u_1) = v_1(s,u_1,u_2(1,u_1))$ satisfies (8)
with $j = 1$ for a unique value of $u_1$, that is, there is a unique $u_1$
for which

\[ (\lambda\sigma_1 + \theta_1 + \kappa_1)\,v(r_1,u_1) - \theta_1 \int_S v(s,u_1)\,d\mu_1(s) + \pi_1 v'(r_1,u_1) = N_1. \tag{13} \]


We first show that some $u_1 \in (-\infty,\infty)$ satisfies (13). Since the
left hand side of (13) is continuous in $u_1$, it suffices to show that the
left hand side of (13) diverges to $\pm\infty$ as $u_1 \to \pm\infty$. We
discuss only the case $u_1 \to +\infty$ since the other is similar.

We show this result by considering the limit of $P\,v(s,P^{-1})$ as
$P \to 0$. To this end, for $P > 0$ denote $u(P) = P\,u_2(1,P^{-1})$ and
$\phi(P) = P\,\phi_1(P^{-1})$. Now $P\,v_1(s,P^{-1},u_2) = v_P(s,1,Pu_2)$, so
$P\,v(s,P^{-1}) = v_P(s,1,u(P))$. In view of this and (11), $u_2 = u(P)$ is
the unique number satisfying

\[ \theta_0 \int_S v_P(s,1,u_2)\,d\mu_0(s) + \pi_0 u_2 = \phi(P). \tag{14} \]

Since $\phi(P)$ has a limit as $P \to 0$, which we denote by $\phi(0)$,
equation (14) has a unique solution $u_2$ for $P = 0$. Since $v_P(s,1,u_2)$
and $\phi(P)$ are continuous in $(P,s,u_2)$, it follows that $u(P)$ is
continuous in $P$ and has the limit $u_2$ as $P \to 0$; we denote
$u(0) = u_2$. In summary, $P\,v(s,P^{-1}) \to v_0(s,1,u(0))$ as $P \to 0$.

We are now in a position to show that the left hand side of (13) diverges
to $\pm\infty$ as $u_1 \to \pm\infty$. If $v_0'(r_0,1,u(0)) \ge 0$, then
Lemma 10 applies and $v_0'(s,1,u(0)) > 0$ for each $s \in (r_0,r_1]$. On the
other hand, if $v_0'(r_0,1,u(0)) < 0$, then by (14)

\[ \theta_0 \int_S v_0(s,1,u(0))\,d\mu_0(s) = \lambda\sigma_0 + \kappa_0 + \theta_0 - \pi_0 v_0'(r_0,1,u(0)) \ge \theta_0. \]

Now $\theta_0 > 0$, for if not then $\pi_0 > 0$ and by (14)
$v_0'(r_0,1,u(0)) = (\lambda\sigma_0 + \kappa_0)/\pi_0 \ge 0$, a contradiction.
Thus for at least one $s_1 \in (r_0,r_1]$ where $d\mu_0(s_1) > 0$ we must have
$v_0(s_1,1,u(0)) \ge 1 = v_0(r_0,1,u(0))$. It follows for some
$s_0 \in [r_0,s_1]$ that $v_0(s_0,1,u(0)) > 0$ and $v_0'(s_0,1,u(0)) \ge 0$.
Applying Lemma 10, we conclude for all $s \in (s_0,r_1]$ that
$v_0(s,1,u(0)) > 0$ and $v_0'(s,1,u(0)) > 0$.

Let $Y$ denote the left hand side of (13) with $v_0(s,1,u(0))$ substituted
for $v(s,u_1)$. We have by the preceding arguments that
$v_0(r_1,1,u(0)) > 0$ and $v_0'(r_1,1,u(0)) > 0$. Moreover, for any
$s \in [r_0,r_1]$ we have $v_0(r_1,1,u(0)) \ge v_0(s,1,u(0))$, so if
$\theta_1 > 0$

\[ \theta_1 v_0(r_1,1,u(0)) - \theta_1 \int_S v_0(s,1,u(0))\,d\mu_1(s) \ge 0, \]

in which case $Y > 0$. Letting $u_1 \to \infty$ in the left hand side of
(13), we have

\[ \lim_{u_1 \to \infty} \Big\{ (\lambda\sigma_1 + \theta_1 + \kappa_1)\,v(r_1,u_1) - \theta_1 \int_S v(s,u_1)\,d\mu_1(s) + \pi_1 v'(r_1,u_1) \Big\} \]
\[ = \lim_{P \to 0} \frac{1}{P} \Big\{ (\lambda\sigma_1 + \theta_1 + \kappa_1)\,v_P(r_1,1,u(P)) - \theta_1 \int_S v_P(s,1,u(P))\,d\mu_1(s) + \pi_1 v_P'(r_1,1,u(P)) \Big\} \]
\[ = \lim_{P \to 0} \frac{Y}{P} = +\infty. \]


Thus, by the remarks following equation (13) there exists some
$u_1 \in (-\infty,\infty)$ which satisfies (13); it remains to show this
$u_1$ is unique.

Suppose there exist two numbers $C_0 < C_1$ and corresponding solutions
$v_0(s)$ and $v_1(s)$ of (5), (6) such that $v_0(r_0) = C_0$ and
$v_1(r_0) = C_1$. Let the Borel measurable function $a_1(s)$ from $S$ into
$K_1$ be such that $a_1(s) \in A^1_s$ and

\[ \min_{a_1 \in A^1_s} \max_{a_2 \in A^2_s} \{\beta(s,a_1,a_2)v_0'(s) - \lambda\alpha(s,a_1,a_2)v_0(s) + \gamma(s,a_1,a_2)\} \]
\[ = \max_{a_2 \in A^2_s} \{\beta(s,a_1(s),a_2)v_0'(s) - \lambda\alpha(s,a_1(s),a_2)v_0(s) + \gamma(s,a_1(s),a_2)\} \]

for each $s \in S$. Let the Borel measurable function $a_2(s)$ from $S$
into $K_2$ be such that $a_2(s) \in A^2_s$ and

\[ \max_{a_2 \in A^2_s} \{\beta(s,a_1(s),a_2)v_1'(s) - \lambda\alpha(s,a_1(s),a_2)v_1(s) + \gamma(s,a_1(s),a_2)\} \]
\[ = \beta(s,a_1(s),a_2(s))v_1'(s) - \lambda\alpha(s,a_1(s),a_2(s))v_1(s) + \gamma(s,a_1(s),a_2(s)) \]

for each $s \in S$. Then

\[ 0 = v_1''(s) + \min_{a_1 \in A^1_s} \max_{a_2 \in A^2_s} \{\beta(s,a_1,a_2)v_1'(s) - \lambda\alpha(s,a_1,a_2)v_1(s) + \gamma(s,a_1,a_2)\} \]
\[ \le v_1''(s) + \max_{a_2 \in A^2_s} \{\beta(s,a_1(s),a_2)v_1'(s) - \lambda\alpha(s,a_1(s),a_2)v_1(s) + \gamma(s,a_1(s),a_2)\} \]
\[ = v_1''(s) + \beta(s,a_1(s),a_2(s))v_1'(s) - \lambda\alpha(s,a_1(s),a_2(s))v_1(s) + \gamma(s,a_1(s),a_2(s)), \]

and similarly,

\[ v_0''(s) + \beta(s,a_1(s),a_2(s))v_0'(s) - \lambda\alpha(s,a_1(s),a_2(s))v_0(s) + \gamma(s,a_1(s),a_2(s)) \le 0. \]

Defining the Borel measurable function

\[ \nu(s) = \frac{d^2}{ds^2}(v_1 - v_0)(s) + \beta(s,a_1(s),a_2(s))\frac{d}{ds}(v_1 - v_0)(s) - \lambda\alpha(s,a_1(s),a_2(s))(v_1 - v_0)(s), \]

we see that $\nu(s) \ge 0$. Letting $v(s) = v_1(s) - v_0(s)$, we see that
$v(s)$ is a solution to

\[ v''(s) = -\beta(s,a_1(s),a_2(s))v'(s) + \lambda\alpha(s,a_1(s),a_2(s))v(s) + \nu(s) \]

satisfying (by subtracting boundary conditions (6))

\[ (\lambda\sigma_j + \theta_j + \kappa_j)\,v(r_j) - \theta_j \int_S v(s)\,d\mu_j(s) - (-1)^j \pi_j v'(r_j) = 0, \qquad j = 0,1, \tag{15} \]

as well as $v(r_0) = C_1 - C_0$.

It remains to show that $v(s)$ cannot simultaneously satisfy all three of
these boundary conditions. Assume $v(r_0) = C_1 - C_0$ and (15) holds for
$j = 0$. Let $\bar s = \min\{s \in S \mid v(s) \ge v(r_0) \text{ and } v'(s) \ge 0\}$.
Note that $\bar s$ exists because if $v'(r_0) \ge 0$ then $\bar s = r_0$,
whereas if $v'(r_0) < 0$, then by (15)

\[ \theta_0 \int_S v(s)\,d\mu_0(s) = (\lambda\sigma_0 + \theta_0 + \kappa_0)\,v(r_0) - \pi_0 v'(r_0) \ge \theta_0 v(r_0), \]

so $\theta_0 > 0$ and for some $s_1 \in (r_0,r_1]$ with $d\mu_0(s_1) > 0$ we
have $v(s_1) \ge v(r_0)$, in which case $\bar s \in (r_0,s_1]$. By Lemma 10
we have $v(s) > 0$ and $v'(s) > 0$ for all $s \in (\bar s,r_1]$. In
particular, the left hand side of (15) is positive for $j = 1$ and
Theorem 3 is proved.


The following theorem provides a necessary and sufficient (saddlepoint)
condition for an admissible control to be a solution to the zero sum,
two-person game. We now revert to the original notation, where
$v(s,a_1,a_2)$ denotes the expected discounted cost of a process
corresponding to the control $a = (a_1,a_2) \in M$.

Theorem 11. Let $v(s)$ be the unique solution of (5), (6). A control
$a = (a_1,a_2) \in M$ is optimal if and only if

\[ \max_{a_2 \in A^2_s} \{ d(s,a_1(s),a_2)^{-1} [ b(s,a_1(s),a_2)v'(s) - \lambda v(s) + c(s,a_1(s),a_2) ] \} \]
\[ = d(s,a_1(s),a_2(s))^{-1} [ b(s,a_1(s),a_2(s))v'(s) - \lambda v(s) + c(s,a_1(s),a_2(s)) ] \tag{16} \]
\[ = \min_{a_1 \in A^1_s} \{ d(s,a_1,a_2(s))^{-1} [ b(s,a_1,a_2(s))v'(s) - \lambda v(s) + c(s,a_1,a_2(s)) ] \} \]

for every $s \in S$ which is a continuity point of $a$ and, for $j = 0$
and $j = 1$, $\sigma_j > 0$ implies

\[ \max_{a_2 \in A^2_{r_j}} c(r_j,a_1(r_j),a_2) = c(r_j,a_1(r_j),a_2(r_j)) = \min_{a_1 \in A^1_{r_j}} c(r_j,a_1,a_2(r_j)). \tag{17} \]

Moreover, if $a$ is optimal then $v(s) = v(s,a_1,a_2)$ is the value of the
game.


Proof. Suppose (16) and (17) are true. By the theory of saddlepoints
we have

\[ \min_{a_1 \in A^1_s} \max_{a_2 \in A^2_s} \{ d(s,a_1,a_2)^{-1} [ b(s,a_1,a_2)v'(s) - \lambda v(s) + c(s,a_1,a_2) ] \} \]
\[ = d(s,a_1(s),a_2(s))^{-1} [ b(s,a_1(s),a_2(s))v'(s) - \lambda v(s) + c(s,a_1(s),a_2(s)) ]. \]

Substituting these in (5) and (6), we see that $v(s)$ is also the unique
solution of (1) and (2), that is, $v(s) = v(s,a_1,a_2)$.

With $a_2(s)$ fixed, in a similar manner we see that $v(s)$ is the unique
solution of (5), (6) of Chapter II, that is, $v(s)$ is the minimal expected
discounted cost for an ordinary optimal control problem involving the
control $a_1(s)$. In view of (16) and (17), we have by Theorem 6 of
Chapter II that $v(s,a_1,a_2) \le v(s,\bar a_1,a_2)$ for each
$\bar a_1 \in M_1$ and each $s \in S$. Similarly,
$v(s,a_1,\bar a_2) \le v(s,a_1,a_2)$ for all $\bar a_2 \in M_2$. Hence $a$
is optimal and $v(s)$ is the value of the game.

Conversely, suppose $a$ is an optimal control. First we'll show that

\[ d(s,a_1(s),a_2(s))^{-1} [ b(s,a_1(s),a_2(s))v'(s,a_1,a_2) - \lambda v(s,a_1,a_2) + c(s,a_1(s),a_2(s)) ] \]
\[ \ge \max_{a_2 \in A^2_s} \{ d(s,a_1(s),a_2)^{-1} [ b(s,a_1(s),a_2)v'(s,a_1,a_2) - \lambda v(s,a_1,a_2) + c(s,a_1(s),a_2) ] \} \]
\[ \ge \min_{a_1 \in A^1_s} \max_{a_2 \in A^2_s} \{ d(s,a_1,a_2)^{-1} [ b(s,a_1,a_2)v'(s,a_1,a_2) - \lambda v(s,a_1,a_2) + c(s,a_1,a_2) ] \} \]
\[ \ge \max_{a_2 \in A^2_s} \min_{a_1 \in A^1_s} \{ d(s,a_1,a_2)^{-1} [ b(s,a_1,a_2)v'(s,a_1,a_2) - \lambda v(s,a_1,a_2) + c(s,a_1,a_2) ] \} \]
\[ \ge \min_{a_1 \in A^1_s} \{ d(s,a_1,a_2(s))^{-1} [ b(s,a_1,a_2(s))v'(s,a_1,a_2) - \lambda v(s,a_1,a_2) + c(s,a_1,a_2(s)) ] \} \]
\[ \ge d(s,a_1(s),a_2(s))^{-1} [ b(s,a_1(s),a_2(s))v'(s,a_1,a_2) - \lambda v(s,a_1,a_2) + c(s,a_1(s),a_2(s)) ]. \]

Now $v(s,a_1,a_2) = \inf_{\bar a_1 \in M_1} v(s,\bar a_1,a_2)$, so by Theorem 6
of Chapter II, the last inequality is true; similarly, the first one is
true. The remaining inequalities are true by saddlepoint theory, so all are
equalities. Similarly, if $\sigma_j > 0$, then

\[ c(r_j,a_1(r_j),a_2(r_j)) = \max_{a_2 \in A^2_{r_j}} c(r_j,a_1(r_j),a_2) = \min_{a_1 \in A^1_{r_j}} \max_{a_2 \in A^2_{r_j}} c(r_j,a_1,a_2) = \min_{a_1 \in A^1_{r_j}} c(r_j,a_1,a_2(r_j)). \]

Substituting these equalities into (1) and (2), we see that $v(s,a_1,a_2)$
is the unique solution of (5) and (6), that is, $v(s,a_1,a_2) = v(s)$.
Substituting $v(s)$ for $v(s,a_1,a_2)$ in the above equalities yields (16),
and Theorem 11 is proved.


A diffusion process two-person, zero sum game problem can be solved in
principle as follows. First, obtain the solution $v(s)$ to (5) and (6).
Second, consider the map $\Gamma$ from $S$ into $K_1 \times K_2$ such that
$(a_1,a_2) \in \Gamma(s)$ if and only if $a_1 \in A^1_s$, $a_2 \in A^2_s$,
and $(a_1,a_2)$ is a saddlepoint of
$d(s,a_1,a_2)^{-1}[b(s,a_1,a_2)v'(s) - \lambda v(s) + c(s,a_1,a_2)]$.
If $\sigma_j > 0$, $j = 0,1$, then redefine $\Gamma(r_j)$ so that
$(a_1,a_2) \in \Gamma(r_j)$ if and only if $a_1 \in A^1_{r_j}$,
$a_2 \in A^2_{r_j}$, and $(a_1,a_2)$ is a saddlepoint of $c(r_j,a_1,a_2)$.
Note that $\Gamma(s) = \emptyset$ is possible for some $s \in S$, in which
case the game is without solution. On the other hand, if
$\Gamma(s) \ne \emptyset$ for each $s \in S$, then $v(s)$ is the value of the
game, even if it cannot be attained. Finally, endeavor to choose a piecewise
continuous function $a(s)$ such that $a(s) \in \Gamma(s)$ for each $s \in S$.
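The second step of this procedure, finding the saddlepoints that make up
the map at a given state, can be illustrated numerically. The sketch below
is not part of the original text: it assumes the action sets have been
replaced by finite grids and that the bracketed saddle function is supplied
as a hypothetical Python callable `h(a1, a2)`. An empty result corresponds
to a state at which the map is empty.

```python
def saddle_points(h, grid1, grid2, tol=1e-12):
    """Return all grid pairs (a1, a2) that are saddlepoints of h.

    A pair is a saddlepoint when a2 maximizes h(a1, .) over grid2 and
    a1 minimizes h(., a2) over grid1 (player 1 minimizes, player 2
    maximizes). The grids are an assumed discretization of the action sets.
    """
    found = []
    for a1 in grid1:
        for a2 in grid2:
            value = h(a1, a2)
            a2_is_best_reply = all(h(a1, b2) <= value + tol for b2 in grid2)
            a1_is_best_reply = all(h(b1, a2) >= value - tol for b1 in grid1)
            if a2_is_best_reply and a1_is_best_reply:
                found.append((a1, a2))
    return found


if __name__ == "__main__":
    # Bilinear drift term with constant cost, as in the examples of this
    # chapter: the only saddlepoint on a symmetric grid is (0, 0).
    bilinear = lambda a1, a2: (a1 * a2 * 2.0 + 3.0) / 1.5
    print(saddle_points(bilinear, [-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]))
    # A function with no saddlepoint on the grid: the result is empty.
    print(saddle_points(lambda a1, a2: (a1 - a2) ** 2, [0.0, 1.0], [0.0, 1.0]))
```

The exhaustive search costs on the order of the product of the grid sizes
times their sum, which is acceptable only for coarse grids; it is meant to
make the definition concrete, not to be an efficient solver.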

The following result is a sufficient condition for (16) and (17) to be
satisfied by $v(s)$ and some Borel measurable control $a(s)$, that is, for
the map mentioned above, $\Gamma(s) \ne \emptyset$ for each $s \in S$. The
real-valued function $h(z)$ on the compact, convex set $C \subset E^n$ is
said to be quasiconvex if $\{z \in C \mid h(z) \le \alpha\}$ is convex for
each $\alpha \in E$. This function is quasiconcave if $-h(z)$ is
quasiconvex. Corollary 13 is an immediate consequence of Theorem 12 which,
in turn, follows easily from a minimax theorem by Sion [20].

Theorem 12. Let $v(s)$ be the unique solution of (5), (6) and suppose
$A^i_s$ is convex for each $s \in S$, $i = 1,2$. Then there exists some
Borel measurable control $a(s) = (a_1(s),a_2(s))$, with $a_1(s) \in A^1_s$
and $a_2(s) \in A^2_s$ for each $s \in S$, which satisfies (16) and (17)
provided

\[ d(s,a_1,a_2)^{-1} [ b(s,a_1,a_2)v'(s) - \lambda v(s) + c(s,a_1,a_2) ] \]

and $c(r_j,a_1,a_2)$, $j = 0,1$, are quasiconvex in $a_1 \in A^1_s$ for each
$a_2 \in A^2_s$ and $s \in S$ and are quasiconcave in $a_2 \in A^2_s$ for
each $a_1 \in A^1_s$ and $s \in S$.

Corollary 13. Let $v(s)$ be the unique solution of (5), (6) and suppose
$A^i_s$ is convex for each $s \in S$, $i = 1,2$. Suppose $d(s,a_1,a_2)$ is
constant with respect to $(a_1,a_2)$, $b(s,a_1,a_2)$ is affine with respect
to $(a_1,a_2)$, $c(s,a_1,a_2)$ is convex in $a_1$ for each $a_2 \in A^2_s$,
and $c(s,a_1,a_2)$ is concave in $a_2$ for each $a_1 \in A^1_s$, all for
each $s \in S$. Then (16) and (17) are satisfied by some Borel measurable
control $(a_1(s),a_2(s))$ with $a_1(s) \in A^1_s$ and $a_2(s) \in A^2_s$.

Example.

$S = [r_0,r_1]$, $0 < r_0 < r_1$
$A^1_s = \{a_1 \in E^1 \mid |a_1| \le z_1 s\}$, $z_1 > 0$
$A^2_s = \{a_2 \in E^1 \mid |a_2| \le z_2 s\}$, $z_2 > 0$
$d(s,a_1,a_2) = A > 0$
$b(s,a_1,a_2) = a_1 a_2 / s$
$c(s,a_1,a_2) = C$
$r_0$ boundary condition: $v'(r_0) = 0$ (reflection)
$r_1$ boundary condition: $v(r_1) = \lambda_1$ (absorption with cost $\lambda_1$)

For any value of $s$, $v(s)$, or $v'(s)$ we have

\[ \min_{a_1 \in A^1_s} \max_{a_2 \in A^2_s} A^{-1} \Big[ \frac{a_1 a_2}{s} v'(s) - \lambda v(s) + C \Big] = \max_{a_2 \in A^2_s} \min_{a_1 \in A^1_s} A^{-1} \Big[ \frac{a_1 a_2}{s} v'(s) - \lambda v(s) + C \Big] = A^{-1} [C - \lambda v(s)], \]

so $a(s) = (0,0)$ is the optimal control for all $s \in S$. Therefore,
the value of the game is given by (1), (2) to be
$v(s) = C_1 e^{ts} + C_2 e^{-ts} + C/\lambda$, where $t = \sqrt{\lambda/A}$,
$C_1 = e^{tr_1}(\lambda_1 - C/\lambda)/(e^{2tr_1} + e^{2tr_0})$, and
$C_2 = e^{tr_1}(\lambda_1 - C/\lambda)/(e^{2t(r_1 - r_0)} + 1)$.
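The closed form in this example can be checked against the reduced equation
and the two boundary conditions. The sketch below is an illustration added
here, not part of the original text. It assumes that at the control (0,0)
equation (1) reduces to A v'' - λ v + C = 0, and it reads the constant t as
the square root of λ/A, which is what that equation requires. The function
name and the parameter values in the usage block are hypothetical.

```python
import math


def discounted_example(s, A, lam, C, lam1, r0, r1):
    """Return (v, v', v'') at s for v(s) = C1*exp(t*s) + C2*exp(-t*s) + C/lam.

    Assumes t = sqrt(lam / A); C1 and C2 are the constants given in the
    example (reflection at r0, absorption with cost lam1 at r1).
    """
    t = math.sqrt(lam / A)
    scale = math.exp(t * r1) * (lam1 - C / lam)
    c1 = scale / (math.exp(2 * t * r1) + math.exp(2 * t * r0))
    c2 = scale / (math.exp(2 * t * (r1 - r0)) + 1.0)
    grow, decay = c1 * math.exp(t * s), c2 * math.exp(-t * s)
    v = grow + decay + C / lam
    dv = t * (grow - decay)
    d2v = t * t * (grow + decay)
    return v, dv, d2v


if __name__ == "__main__":
    A, lam, C, lam1, r0, r1 = 2.0, 0.5, 1.0, 4.0, 1.0, 3.0
    _, slope_at_r0, _ = discounted_example(r0, A, lam, C, lam1, r0, r1)
    value_at_r1, _, _ = discounted_example(r1, A, lam, C, lam1, r0, r1)
    v_mid, _, d2v_mid = discounted_example(2.0, A, lam, C, lam1, r0, r1)
    print("v'(r0)       =", slope_at_r0)                  # reflection: should be ~0
    print("v(r1) - lam1 =", value_at_r1 - lam1)           # absorption: should be ~0
    print("ODE residual =", A * d2v_mid - lam * v_mid + C)  # should be ~0
```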


4. The Zero Sum Problem with Undiscounted Costs.


The zero sum, two-person diffusion process game problem with
undiscounted costs will be one of two types, depending on whether the
boundary conditions are conservative or non-conservative. The results
in this section parallel those of Section 3, and, consequently, they
will be brief. The conservative case will be treated in the second half
of this section. For the purposes of this section, the boundary
conditions are said to be non-conservative if at least one boundary is
absorbing and neither boundary is purely adhesive, that is,

\[ \kappa_0 + \kappa_1 > 0, \qquad \kappa_j + \pi_j + \theta_j > 0, \qquad j = 0,1. \]

Let $v(s,a_1,a_2) = v(s)$ denote the expected undiscounted cost of a
non-conservative process corresponding to the admissible control
$a = (a_1,a_2) \in M$. Then $v(s,a_1,a_2)$ will be the unique solution of
(1), (2) with $\lambda = 0$. The control $a \in M$ is said to be optimal if
for all $\bar a_1 \in M_1$, all $\bar a_2 \in M_2$, and all $s \in S$ we have

\[ v(s,a_1,\bar a_2) \le v(s,a_1,a_2) \le v(s,\bar a_1,a_2), \]

in which case $v(s,a_1,a_2)$ is said to be the value of the game. It will
subsequently be proved that the value of a game, if it exists, is provided
by the following result whose proof is a generalization of one by
Mandl [16, pp. 158-167].


Theorem 14. With non-conservative boundary conditions, there exists
a unique solution $v(s)$ to

\[ v''(s) + \min_{a_1 \in A^1_s} \max_{a_2 \in A^2_s} \{ d(s,a_1,a_2)^{-1} [ b(s,a_1,a_2)v'(s) + c(s,a_1,a_2) ] \} = 0 \tag{18} \]


satisfying 


(19) 


(0j  +  ”  ®j  f  +  Vj  (s))dUj  (s) 


-<-i)J"jv'(rj)  -  Yj  -  Vi  ■ 0 


i  -  0,1, 


where

\[ \gamma_j = \min_{a_1 \in A^1_{r_j}} \max_{a_2 \in A^2_{r_j}} c(r_j,a_1,a_2), \qquad j = 0,1. \]

Proof. Lemma 7 does not depend on $\lambda > 0$, so for every
$u_1,u_2 \in (-\infty,\infty)$ equation (18) has a unique solution $v(s)$
satisfying $v(r_0) = u_1$ and $v'(r_0) = u_2$. For fixed $u_1$ and $u_2$
denote $w(s,u_2) = v'(s)$ and note that $w(s,u_2)$ is independent of $u_1$
since it is the solution of a first order differential equation under the
initial condition $w(r_0,u_2) = u_2$.


Writing  for  j  ■  0,1 


Nj  “  Sj  /vj(3)dpj(5)  +  Cjvj  *  Y 


we have that (19) is equivalent to

\[ \kappa_0 u_1 - \theta_0 \int_S \int_{r_0}^{s} w(t,u_2)\,dt\,d\mu_0(s) - \pi_0 u_2 = N_0, \]
\[ \kappa_1 u_1 + (\theta_1 + \kappa_1) \int_{r_0}^{r_1} w(t,u_2)\,dt - \theta_1 \int_S \int_{r_0}^{s} w(t,u_2)\,dt\,d\mu_1(s) + \pi_1 w(r_1,u_2) = N_1. \tag{20} \]

Eliminating $u_1$ from (20), we obtain the equation for $u_2$:

\[ \kappa_0(\theta_1 + \kappa_1) \int_{r_0}^{r_1} w(t,u_2)\,dt + \kappa_1\theta_0 \int_S \int_{r_0}^{s} w(t,u_2)\,dt\,d\mu_0(s) + \kappa_1\pi_0 u_2 + \kappa_0\pi_1 w(r_1,u_2) - \kappa_0\theta_1 \int_S \int_{r_0}^{s} w(t,u_2)\,dt\,d\mu_1(s) = \kappa_0 N_1 - \kappa_1 N_0. \tag{21} \]

It remains to show that (21) is solved by a unique value of $u_2$, since
then $u_1$ can be obtained from (20). By Lemma 8, $w(s,u_2)$ is continuous
and strictly increasing in $u_2$ and $w(s,u_2) \to \pm\infty$ as
$u_2 \to \pm\infty$, in which case the left hand side of (21) has these same
properties (see Mandl [16, p. 163]). Hence (21) has a unique solution and
Theorem 14 is proved.


Theorem 15. With undiscounted costs and non-conservative boundary
conditions, let $v(s)$ be the unique solution of (18), (19). A control
$a = (a_1,a_2) \in M$ is optimal if and only if

\[ \max_{a_2 \in A^2_s} \{ d(s,a_1(s),a_2)^{-1} [ b(s,a_1(s),a_2)v'(s) + c(s,a_1(s),a_2) ] \} \]
\[ = d(s,a_1(s),a_2(s))^{-1} [ b(s,a_1(s),a_2(s))v'(s) + c(s,a_1(s),a_2(s)) ] \tag{22} \]
\[ = \min_{a_1 \in A^1_s} \{ d(s,a_1,a_2(s))^{-1} [ b(s,a_1,a_2(s))v'(s) + c(s,a_1,a_2(s)) ] \} \]

for each $s \in S$ which is a continuity point of $a(s)$, and, for $j = 0$
and $j = 1$, $\sigma_j > 0$ implies

\[ \max_{a_2 \in A^2_{r_j}} c(r_j,a_1(r_j),a_2) = c(r_j,a_1(r_j),a_2(r_j)) = \min_{a_1 \in A^1_{r_j}} c(r_j,a_1,a_2(r_j)). \tag{23} \]

Moreover, if $a(s)$ is optimal, then $v(s) = v(s,a_1,a_2)$ is the value
of the game.


This  proof  is  essentially  identical  to  that  for  Theorem  11,  so  it 
will  be  omitted.  A  diffusion  process  zero  sum,  two-person  game  problem 
in  the  undiscounted  cost,  non-conservative  process  case  can  be  solved, 
in  principle,  in  the  same  manner  as  with  the  discounted  cost  case. 
Moreover,  there  exist  sufficient  conditions  analogous  to  those  of 
Theorem  12  and  Corollary  13  for  equations  (22)  and  (23)  to  be  satisfied 
by  some  Borel  measurable  control  a(s)  . 
Example.

$S = [0,1]$
$A^1_s = [-Y,Y] \subset E$, $Y > 0$
$A^2_s = [-Z,Z] \subset E$, $Z > 0$
$d(s,a_1,a_2) = A > 0$
$b(s,a_1,a_2) = a_1 a_2$
$c(s,a_1,a_2) = C$
$r_0$ boundary condition: $v(r_0) = \lambda_0$ (absorption with cost $\lambda_0$)
$r_1$ boundary condition: $v'(r_1) = 0$ (reflection).

For any value of $s$ or $v'(s)$ we have

\[ \min_{a_1 \in A^1_s} \max_{a_2 \in A^2_s} \{ A^{-1} [ a_1 a_2 v'(s) + C ] \} = \max_{a_2 \in A^2_s} \min_{a_1 \in A^1_s} \{ A^{-1} [ a_1 a_2 v'(s) + C ] \} = A^{-1}C, \]

so $a(s) = (0,0)$ is the optimal control for all $s \in S$. Therefore,
the value of the game is given by (1) and (2) to be
$v(s) = -\tfrac{1}{2}(C/A)s^2 + (C/A)s + \lambda_0$.
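The existence argument for Theorem 14 is in effect a shooting method: the
derivative w(s,u2) is monotone in the initial slope u2, so the boundary
condition at the far end pins down u2 uniquely. The sketch below applies
that idea to this example only and is not taken from the original text. It
assumes the saddle value of the bracketed term is C/A, so that w' = -C/A on
[0,1], uses an explicit Euler integrator, and finds u2 by bisection on the
reflection condition w(1,u2) = 0. All function names, the bracket
[-1000, 1000], and the step counts are hypothetical choices.

```python
def terminal_slope(u2, A, C, steps=1000):
    """Integrate w' = -C/A from s = 0 to s = 1 with w(0) = u2 (explicit Euler)."""
    h = 1.0 / steps
    w = u2
    for _ in range(steps):
        w += h * (-C / A)
    return w


def shoot_for_u2(A, C, lo=-1e3, hi=1e3, iterations=100):
    """Bisection on u2 so that w(1, u2) = 0.

    Relies on w(1, u2) being continuous and strictly increasing in u2,
    and on the root lying inside the assumed bracket [lo, hi].
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if terminal_slope(mid, A, C) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def value(s, A, C, lam0):
    """v(s) = lam0 + integral of w from 0 to s, with w(t) = u2 - (C/A) t."""
    u2 = shoot_for_u2(A, C)
    return lam0 + u2 * s - 0.5 * (C / A) * s * s


if __name__ == "__main__":
    A, C, lam0 = 2.0, 3.0, 1.0
    print("u2 found by shooting:", shoot_for_u2(A, C), "vs C/A =", C / A)
    s = 0.4
    closed_form = -0.5 * (C / A) * s * s + (C / A) * s + lam0
    print("v(0.4):", value(s, A, C, lam0), "vs closed form", closed_form)
```

For a general drift the integrand would depend on w itself and the Euler
step would have to evaluate the min-max at each point; the constant
right-hand side here is special to the example.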


We now discuss the other type of undiscounted cost problem, the
conservative case. For the purposes of this section, the boundary
conditions are said to be conservative if neither boundary is absorbing
and at least one boundary is not purely adhesive, that is,

\[ \kappa_0 = \kappa_1 = 0, \qquad \pi_0 + \theta_0 + \pi_1 + \theta_1 > 0. \]

Let $\Theta(a_1,a_2) = \Theta$ denote the mean cost per unit time of such a
process corresponding to the admissible control $a = (a_1,a_2) \in M$. Then
$\Theta(a_1,a_2)$ is the unique number to which there exists a solution to
(3) and (4). The control $a \in M$ is said to be optimal if for all
$\bar a_1 \in M_1$ and all $\bar a_2 \in M_2$ we have

\[ \Theta(a_1,\bar a_2) \le \Theta(a_1,a_2) \le \Theta(\bar a_1,a_2), \]

in which case $\Theta(a_1,a_2)$ is said to be the value of the game. The
following result characterizes the value of a game.


Theorem 16. With conservative boundary conditions there exists a unique
number $\Theta$ such that the equation

\[ w'(s) + \min_{a_1 \in A^1_s} \max_{a_2 \in A^2_s} \{ d(s,a_1,a_2)^{-1} [ b(s,a_1,a_2)w(s) - \Theta + c(s,a_1,a_2) ] \} = 0 \tag{24} \]

has a solution $w(s)$ satisfying


w(y)dy  +  ..(s)  d-j.(s)  +  (-i)" "  ,w(r . )  +  o.(y.  -  C)  =  0  , 

3  J  1  J  )  1  j 


j  =  0  1 


where $\gamma_j = \min_{a_1 \in A^1_{r_j}} \max_{a_2 \in A^2_{r_j}} c(r_j,a_1,a_2)$.
Proof. This proof is rather similar to that for Theorem 3, so it will
only be sketched. By Lemma 7 for every $u_2, \Theta \in (-\infty,\infty)$
there exists a unique solution $w(s,u_2,\Theta)$ to (24) satisfying
$w(r_0,u_2,\Theta) = u_2$. By Lemma 8, $w(s,u_2,\Theta)$ is continuous and
strictly increasing in $u_2$ and $w(s,u_2,\Theta) \to \pm\infty$ as
$u_2 \to \pm\infty$. It follows that the left hand side of (25) with
$w(s,u_2,\Theta)$ substituted for $w(s)$ is continuous and strictly
increasing (decreasing) in $u_2$ and diverges to $\pm\infty$ ($\mp\infty$)
as $u_2 \to \pm\infty$ for $j = 0$ ($j = 1$). Thus, if boundary $r_j$ is
purely adhesive, then $\Theta = \gamma_j$ and $u_2$ can be determined
uniquely from the other boundary condition.

On the other hand, if neither boundary is purely adhesive, then to every
$\Theta$ there exists a unique number $u_2 = u_2(\Theta)$ such that
$w(s,\Theta) = w(s,u_2(\Theta),\Theta)$ satisfies (25) for $j = 0$. It
remains to show that $w(s,\Theta)$ satisfies (25) for $j = 1$ with a unique
value of $\Theta$. Consider $\Theta^{-1}w(s,\Theta)$ for $\Theta > 0$. It
can be shown as with Theorem 3 that $\Theta^{-1}w(s,\Theta) \to w(s)$ as
$\Theta \to \infty$ for all $s \in S$, where $w(s)$ is the solution to

\[ w'(s) = -\min_{a_1 \in A^1_s} \max_{a_2 \in A^2_s} \{ d(s,a_1,a_2)^{-1} [ b(s,a_1,a_2)w(s) - 1 ] \} \]


satisfying 


jl 


^  J  w(y)dydLo(s)  +  r.0w(rQ)  -  oQ 


=  0  . 


After  showing  that 
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□ 

01  J  J w(y)dydu1(a)  -  n^w^)  -  «■  0 

S  r0 


we conclude that the left hand side of (25) for $j = 1$ with $w(s,\Theta)$
substituted for $w(s)$ diverges to $\mp\infty$ as $\Theta \to \pm\infty$. By
continuity, $w(s,\Theta)$ satisfies (25) for $j = 1$ with some value
$\Theta$. This solution $\Theta$ is unique; otherwise a contradiction can be
derived as was done with Theorem 3 to show the unicity of $v(r_0)$.

Theorem 17. With undiscounted costs and conservative boundary conditions,
let $\Theta$ be the unique number such that (24) has a solution $w(s)$
satisfying (25). A control $a = (a_1,a_2) \in M$ is optimal if and only if

\[ \max_{a_2 \in A^2_s} \{ d(s,a_1(s),a_2)^{-1} [ b(s,a_1(s),a_2)w(s) - \Theta + c(s,a_1(s),a_2) ] \} \]
\[ = d(s,a_1(s),a_2(s))^{-1} [ b(s,a_1(s),a_2(s))w(s) - \Theta + c(s,a_1(s),a_2(s)) ] \tag{26} \]
\[ = \min_{a_1 \in A^1_s} \{ d(s,a_1,a_2(s))^{-1} [ b(s,a_1,a_2(s))w(s) - \Theta + c(s,a_1,a_2(s)) ] \} \]

for every $s \in S$ which is a continuity point of $a(s)$, and, for
$j = 0$ and $j = 1$, $\sigma_j > 0$ implies

\[ \max_{a_2 \in A^2_{r_j}} c(r_j,a_1(r_j),a_2) = c(r_j,a_1(r_j),a_2(r_j)) = \min_{a_1 \in A^1_{r_j}} c(r_j,a_1,a_2(r_j)). \tag{27} \]

Moreover, if $a(s)$ is optimal then $\Theta = \Theta(a_1,a_2)$ is the value
of the game.


This  proof  is  essentially  identical  to  that  for  Theorem  11,  so 
it  will  be  omitted.  A  diffusion  process  zero  sum,  two-person  game 
problem  in  the  undiscounted  cost,  conservative  process  case  can,  in 
principle,  be  solved  in  the  same  manner  as  with  the  discounted  cost 
case.  Moreover,  there  exist  sufficient  conditions  analogous  to  those 
of  Theorem  12  and  Corollary  13  for  equations  (26)  and  (27)  to  be  satis¬ 
fied  by  some  Borel  measurable  control  a(s)  . 


Example.

$S = [0,1]$
$A^1_s = [-Y,Y] \subset E$, $Y > 0$
$A^2_s = [-Z,Z] \subset E$, $Z > 0$
$d(s,a_1,a_2) = A > 0$
$b(s,a_1,a_2) = a_1 a_2$
$c(s,a_1,a_2) = Cs$

Suppose both boundary conditions are pure reflection. For any value of
$s$, $w(s)$, and $\Theta$ we have

\[ \min_{a_1 \in A^1_s} \max_{a_2 \in A^2_s} \{ A^{-1} [ a_1 a_2 w(s) - \Theta + Cs ] \} = \max_{a_2 \in A^2_s} \min_{a_1 \in A^1_s} \{ A^{-1} [ a_1 a_2 w(s) - \Theta + Cs ] \} = A^{-1} [ Cs - \Theta ], \]

so $a(s) = (0,0)$ is the optimal control for all $s \in S$. Therefore,
the value of this game is given by (3), (4) to be $\Theta = C/2$, with
$w(s) = (\Theta/A)s - \tfrac{1}{2}(C/A)s^2$.
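The stated value can be confirmed by direct integration. The short
derivation below is a sketch added for illustration; it writes the mean
cost per unit time as Θ and assumes that pure reflection at both ends
reduces the boundary conditions (4) to w(0) = w(1) = 0.

```latex
\begin{align*}
w'(s) &= A^{-1}(\Theta - Cs)
  && \text{equation (24) at the saddlepoint } a(s) = (0,0), \\
w(s) &= (\Theta/A)\,s - \tfrac{1}{2}(C/A)\,s^{2}
  && \text{integrating from } 0 \text{ with } w(0) = 0, \\
0 = w(1) &= A^{-1}\bigl(\Theta - \tfrac{1}{2}C\bigr)
  && \Longrightarrow\ \Theta = C/2 .
\end{align*}
```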


5. Application: Optimal Welfare Policies


Suppose the problem of determining some government's optimal
welfare policy can be posed as a diffusion process zero sum, two-person
game as follows. Let the state space correspond to some population so
that the state of the process will equal the number of people receiving
welfare. Assume the boundary conditions are pure reflection. Let the
first control component, operated by the government, be the cost of
welfare per person per unit time. Let the second control component,
operated by the population, equal the cost of civil disturbances per
person per unit time. Finally, let the costs of this welfare game be
represented by a continuous movement cost which equals the sum of the
total welfare and total civil disturbance costs per unit time. We
naively assume the civil unrest cost to the government equals the
reward (e.g., satisfaction) to the participants. Thus, the total cost
to the government equals the total reward to the population and the
government acts to minimize, while the population acts to maximize, the
expected costs of this game.

Presumably, the drift and diffusion coefficients should reflect
the fact that the greater the welfare cost per person the greater the
tendency for the number of people receiving welfare to increase.
Similarly, these coefficients should reflect any tendency for the number
of people receiving welfare to decrease as the civil disturbance cost
is increased. This tendency would exist, for example, if the government
were to retaliate by more strictly enforcing welfare eligibility
requirements. In summary, for every combination of controls, the number of
people receiving welfare can be represented by a conservative diffusion
process.


Example. This example involves undiscounted costs. Let $S = [0,P]$ and
suppose $A^i_s = [0,C_i]$ for $i = 1,2$. Let the diffusion coefficient be
$A > 0$, the drift coefficient be the general function $b(s,a_1,a_2)$, and
the continuous movement cost $a_1 s + a_2 P$. By inspection, if
$a(s) = (0,C_2)$ and $\Theta = C_2 P$, then (3), (4) have the unique solution
$w(s) = 0$. Moreover, this control $a(s)$ satisfies (26) so it is optimal
and $\Theta$ is the value of the game.


6. The Non-zero Sum, N-person Game Problem.

The remainder of this chapter describes a class of controlled
diffusion processes whose control problems can be viewed as non-zero
sum, N-person games. We consider the multi-person controlled diffusion
process of Section 1; these processes are controlled by $N$ persons and
generate $N$ streams of costs ($N \ge 2$). Controller $i$ ($i = 1,\dots,N$),
who operates the $i$th control, endeavors to choose a control $a_i \in M_i$
so as to minimize the costs of the $i$th cost stream generated by the
process. A game situation exists by virtue of the fact that the cost to
the $i$th person is influenced by the actions of the other players.

The optimality criterion used for these processes is that of a
Nash equilibrium point. If an admissible expected cost is defined to
be the expected cost corresponding to some admissible control, then the
solution to this game will be some admissible control whose corresponding
expected cost is a Nash equilibrium point with respect to all admissible
costs. Thus if player $i$ unilaterally deviates from his component of
this optimal control then his expected costs will either be unchanged
or increased. The adoption of the Nash equilibrium point optimality
criterion is made in recognition of the fact that a variety of
meritorious optimality criteria exist for non-zero sum, N-person game
problems. In particular, a "prisoner's dilemma" situation might exist
where the players would gain by deviating from the Nash equilibrium point
solution in a cooperative manner.
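The Nash criterion and the prisoner's dilemma remark can be made concrete
on a finite game. The sketch below is an illustration added here and not
part of the original development: it replaces the control sets by small
finite action lists, takes the players' costs from a hypothetical callable
`costs(profile)`, and keeps a profile only if no single player can lower
his own cost by deviating alone. The cost table in the usage block is an
invented two-player example.

```python
from itertools import product


def nash_equilibria(costs, action_sets):
    """Return all action profiles from which no player gains by deviating alone.

    costs(profile) must return a tuple with one cost per player; every
    player is a minimizer. action_sets holds one finite list per player.
    """
    equilibria = []
    for profile in product(*action_sets):
        base = costs(profile)
        stable = True
        for i, actions in enumerate(action_sets):
            for alternative in actions:
                deviated = profile[:i] + (alternative,) + profile[i + 1:]
                if costs(deviated)[i] < base[i]:
                    stable = False
                    break
            if not stable:
                break
        if stable:
            equilibria.append(profile)
    return equilibria


if __name__ == "__main__":
    # Hypothetical prisoner's dilemma costs: "C" cooperates, "D" defects.
    table = {
        ("C", "C"): (1, 1),
        ("C", "D"): (3, 0),
        ("D", "C"): (0, 3),
        ("D", "D"): (2, 2),
    }
    found = nash_equilibria(lambda p: table[p], [["C", "D"], ["C", "D"]])
    print(found)  # the single equilibrium is mutual defection
    # Both players would pay less at ("C", "C"), which is not an equilibrium.
    print(table[("C", "C")], "versus", table[found[0]])
```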

The following two sections provide results respectively for the
discounted cost case and the undiscounted cost case. The main result
of each section is a necessary and sufficient condition for a control
to be optimal. In addition, a method based upon the theory of
differential games is provided for solving a diffusion process non-zero
sum, N-person game problem. This method is substantially the same as a
method used for solving an ordinary diffusion process optimal control
problem. The optimizing solution of an equation is substituted into a
differential equation whose solution, in turn, is used to obtain the
optimal control. The final two sections indicate two possible
applications of this model: control of pollution and optimal warfare
strategies.

To minimize ambiguity the following terminology is used. The control
$a_i \in M_i$ is operated by the $i$th player and is generally a
vector-valued function on $S$. The control
$a = (a_1,\dots,a_N) \in M = M_1 \times M_2 \times \cdots \times M_N$ is the
vector consisting of the $N$ players' controls.


7  The  Non-zero  Suit.  Problem  with  Discounted  Casts 

Let $v(s,a) = v(s,a_1,\ldots,a_N)$ denote the expected discounted cost
of a process corresponding to the admissible control $a \in M$, and let
$v_i(s,a)$ be its $i$th component, $i = 1,\ldots,N$. Then $v(s,a)$ will be
the unique solution of (1), (2). The control $a \in M$ is said to be
optimal, that is, a solution of the game, if it is a Nash equilibrium
point of the expected discounted cost functions, that is,

$v_i(s,a) \le v_i(s,a_1,\ldots,a_{i-1},\alpha_i,a_{i+1},\ldots,a_N)$

for all $s \in S$, all $\alpha_i \in M_i$, and each $i = 1,\ldots,N$. In this
case $v(s,a)$ is said to be a value of the game.
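As a finite illustration of this equilibrium condition, the following Python sketch enumerates the pure-strategy Nash equilibrium points of a small cost game and collects the corresponding cost vectors. The names `is_nash`, `nash_values`, and `PD_COSTS` and the cost table itself are hypothetical, chosen only for illustration and not taken from the text. The table is a "prisoner's dilemma" in cost form, so its single equilibrium point is worse for both players than joint cooperation.

```python
from itertools import product

def is_nash(costs, profile, n_actions):
    """True if no player can lower his own cost by a unilateral deviation."""
    for i, n_i in enumerate(n_actions):
        for alt in range(n_i):
            deviated = profile[:i] + (alt,) + profile[i + 1:]
            if costs[deviated][i] < costs[profile][i]:
                return False
    return True

def nash_values(costs, n_actions):
    """Map each pure Nash equilibrium profile to its vector of costs."""
    profiles = product(*(range(n) for n in n_actions))
    return {p: costs[p] for p in profiles if is_nash(costs, p, n_actions)}

# Hypothetical cost table: action 0 = cooperate, action 1 = defect.
# PD_COSTS[(a1, a2)] = (cost to player 1, cost to player 2).
PD_COSTS = {
    (0, 0): (1, 1),
    (0, 1): (3, 0),
    (1, 0): (0, 3),
    (1, 1): (2, 2),
}

equilibria = nash_values(PD_COSTS, (2, 2))
print(equilibria)
```

Here the only equilibrium profile is mutual defection with costs (2, 2), although both players would pay only 1 under joint cooperation.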
To simplify our notation, define

$\operatorname{val}_{a \in Z}\, g(a) = \{\, g(\alpha) : \alpha \text{ is a Nash equilibrium point of } g \text{ on } Z \,\}$,

where $\alpha_i \in Z_i$ for $i = 1,\ldots,N$, $Z = Z_1 \times \cdots \times Z_N$,
and $g : Z \to E^N$.

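The main result of this section is the following.

Theorem 18. A control $a \in M$ is optimal if and only if for each
$s \in S$ which is a continuity point of $a(s)$

(28)  $d(s,a(s))^{-1}[\,b(s,a(s))v'(s) - \lambda v(s) + c(s,a(s))\,] \in \operatorname{val}_{a \in A_s} \{ d(s,a)^{-1}[\,b(s,a)v'(s) - \lambda v(s) + c(s,a)\,] \}$,

where $v(s) = v(s,a)$, and

(29)  $\sigma_j\,(c(r_j,a(r_j)) - \gamma_j) = 0, \quad j = 0,1,$

where $\gamma_j \in \operatorname{val}_{a \in A_{r_j}} c(r_j,a)$, $j = 0,1$.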

Proof. Let $a$ be optimal. For arbitrary $i$ let $a_1,\ldots,a_{i-1},
a_{i+1},\ldots,a_N$ be fixed so that

$v_i(s,a) = \inf_{\alpha_i \in M_i} v_i(s,a_1,\ldots,a_{i-1},\alpha_i,a_{i+1},\ldots,a_N)$

is the minimal expected discounted cost of an optimal control problem
and $a_i$ is one of its optimal controls. By Theorem 6 of Chapter II

(30)  $d(s,a(s))^{-1}[\,b(s,a(s))v_i'(s) - \lambda v_i(s) + c_i(s,a(s))\,]$
      $= \min_{\alpha_i \in A_s^i} \{ d(s,a_1(s),\ldots,\alpha_i,\ldots,a_N(s))^{-1}[\,b(s,a_1(s),\ldots,\alpha_i,\ldots,a_N(s))v_i'(s) - \lambda v_i(s) + c_i(s,a_1(s),\ldots,\alpha_i,\ldots,a_N(s))\,] \}$

for each $s \in S$ which is a continuity point of $a_i(s)$ and

(31)  $\sigma_j\,(c_i(r_j,a(r_j)) - \gamma_j^i) = 0, \quad j = 0,1,$

where $\gamma_j^i = \min_{\alpha_i \in A_{r_j}^i} c_i(r_j,a_1(r_j),\ldots,\alpha_i,\ldots,a_N(r_j))$, $j = 0,1$.

Since $i$ is arbitrary, (28) and (29) must be true.

Conversely, suppose (28) and (29) hold and let $i$ be arbitrary.
Now (30) and (31) must hold, so by Theorem 6 of Chapter II we see that
$v_i(s,a)$ is the minimal expected discounted cost of the ordinary optimal
control problem: minimize $v_i(s,a_1,\ldots,\alpha_i,\ldots,a_N)$ subject to
$\alpha_i \in M_i$. Since $i$ is arbitrary, $a$ defines a Nash equilibrium
point for this game.


Theorem 18 is substantially different from Theorem 11 for the
zero sum, two-person game situation in one respect. In each case the
necessary and sufficient condition is a function of the solution to a
differential equation. With Theorem 18, this solution is explicitly
a function of some control $a \in M$, whereas in the case of Theorem 11
the corresponding differential equation solution is explicitly
independent of any control $a \in M$. Thus, given a control $a \in M$, one
can determine $v(s,a)$ with Theorem 1 and then ascertain whether $a(s)$ is
optimal with Theorem 18. Conversely, an optimal control $a(s)$ will
satisfy (28) and (29). However, Theorem 18 does not provide an explicit
procedure for solving the diffusion process non-zero sum, N-person game
problem.

The following computational procedure is based upon a method
devised by Starr and Ho [21] as well as Case [3] for solving non-zero
sum differential games. Let $t = (t_1,\ldots,t_N)$ and $u = (u_1,\ldots,u_N)$
and define $g(s,t,u,a) : S \times E^{2N} \times K \to E^N$ by

$g(s,t,u,a) = d(s,a)^{-1}[\,b(s,a)u - \lambda t + c(s,a)\,]$.

Now consider the point-to-set map $\Gamma : S \times E^{2N} \to K$ defined by

$\Gamma(s,t,u) = \{\, \alpha \in A_s \mid g(s,t,u,\alpha) \in \operatorname{val}_{a \in A_s} g(s,t,u,a) \,\}$.

If $\sigma_j > 0$ for $j = 0$ or $j = 1$, then redefine $\Gamma(r_j,t,u)$ so that

$\Gamma(r_j,t,u) = \{\, \alpha \in A_{r_j} \mid c(r_j,\alpha) \in \operatorname{val}_{a \in A_{r_j}} c(r_j,a) \,\}$.

If $\Gamma(s,t,u) \ne \emptyset$ for each $(s,t,u)$, then choose a function
$a(s,t,u)$ with $a(s,t,u) \in \Gamma(s,t,u)$ for each $(s,t,u)$, substitute
$a(s,v(s),v'(s))$ for $a(s)$ in (1) and (2), and solve for $v(s)$. If $v(s)$
exists, then it is a value of the game. If $a(s) = a(s,v(s),v'(s))$ is
piecewise continuous, then it is an optimal control and $v(s) = v(s,a)$.
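The point-to-set map in this procedure can be approximated numerically when each player's admissible set is replaced by a finite grid. The sketch below is only an illustration under that assumption. The helper names `g` and `gamma_on_grid`, the grid, and the parameter values `lam` and `C` are hypothetical; the data $d = 1$, $b = a_1 + a_2$, $c_i = C + a_i$ are those of the two-person example of this section.

```python
from itertools import product

def g(t, u, a, lam=0.5, C=1.0):
    """g_i = b*u_i - lam*t_i + c_i with d = 1, b = a1 + a2, c_i = C + a_i."""
    b = a[0] + a[1]
    return tuple(b * u[i] - lam * t[i] + C + a[i] for i in range(2))

def gamma_on_grid(t, u, grid):
    """Grid approximation of Gamma(s,t,u): action pairs that no player
    can improve upon by changing only his own grid action."""
    points = []
    for a in product(grid, repeat=2):
        stable = True
        for i in range(2):
            for alt in grid:
                dev = list(a)
                dev[i] = alt          # unilateral deviation of player i
                if g(t, u, dev)[i] < g(t, u, a)[i] - 1e-12:
                    stable = False
        if stable:
            points.append(a)
    return points

grid = [k / 4 for k in range(5)]      # hypothetical grid on [0, 1]
both_steep = gamma_on_grid((0.0, 0.0), (-2.0, -2.0), grid)
both_flat = gamma_on_grid((0.0, 0.0), (-0.5, -0.5), grid)
mixed = gamma_on_grid((0.0, 0.0), (-2.0, -0.5), grid)
print(both_steep, both_flat, mixed)
```

For these inputs each player's action enters his own component of $g$ only through $a_i(u_i + 1)$, so the grid search selects $a_i = 1$ when $u_i < -1$ and $a_i = 0$ when $u_i > -1$. When $u_i = -1$ every grid action is a best response, which is one way the map can fail to be single-valued.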

Note that this procedure may break down in three different ways:
$\Gamma(s,t,u)$ may not exist; $v(s)$ may not exist; and $a(s,v(s),v'(s))$ may
not be piecewise continuous. The reason why $v(s)$ may fail to exist,
although $a(s,t,u)$ does, is that the Euclidean norm of
$g(s,t,u,a(s,t,u))$ may fail to be continuous on $S \times E^{2N}$. Since most
differential equation theory existence theorems specify some form of
continuity requirement, counterexamples can be easily constructed. The
following proposition serves to characterize $\Gamma(s,t,u)$.

Proposition 19. If $A_s$ is continuous on $S$, then $\Gamma$ is a closed map
on $S \times E^{2N}$.


Proof. For each $i = 1,\ldots,N$ define the point-to-set map
$\Gamma_i : S \times E^{2N} \times K \to K_i$ by

$\Gamma_i(s,t,u,a) = \arg\min_{a_i \in A_s^i} g_i(s,t,u,a)$.

The continuity of $g_i(s,t,u,a)$ implies $\Gamma_i$ is a closed map, so its
graph $H_i = \{ (s,t,u,a) \mid a_i \in \Gamma_i(s,t,u,a) \}$ is a closed set. The
map $\Gamma$ is thus closed because its graph $H_1 \cap \cdots \cap H_N$ is closed.


Proposition 20. Suppose each component $i$ of $g(s,t,u,a)$ is quasi-convex
in $a_i$ for each $s \in S$, each $a_j \in K_j$ ($j = 1,\ldots,i-1,i+1,\ldots,N$),
and each $t, u \in E^N$, and assume $A_s^i$ is convex for each $s \in S$
and $i = 1,\ldots,N$. Then $\Gamma(s,t,u) \ne \emptyset$ for each
$(s,t,u) \in S \times E^{2N}$.


The proof of Proposition 20 is omitted because it follows easily
from Rosen [17] and Sion [20]. If Proposition 20 holds and the Nash
equilibrium point is unique for each $(s,t,u)$, then the closed map
$\Gamma(s,t,u)$ is simply a piecewise continuous function. This observation
leads to the following existence theorem. Following Rosen [17], the
function $g : E^N \to E^N$ is said to be diagonally strictly convex for
$a \in K$ if for each $a^0, a^1 \in K$ we have

$(a^1 - a^0)^T f(a^0) + (a^0 - a^1)^T f(a^1) < 0$,

where $f(a) = (\partial g_1(a)/\partial a_1, \ldots, \partial g_N(a)/\partial a_N)^T$.
A sufficient condition that $g(a)$ be diagonally strictly convex is that
the symmetric matrix $[F(a) + F^T(a)]$ be positive definite for $a \in K$,
where $F(a)$ is the Jacobian with respect to $a$ of $f(a)$ (Rosen [17]).
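The sufficient condition attributed to Rosen [17] can be tested mechanically when the Jacobian $F$ is constant, as it is for quadratic cost functions. The following Python sketch does this with a hand-written Cholesky test. The function names and the two quadratic games are hypothetical illustrations, not taken from the text.

```python
def is_positive_definite(m):
    """Cholesky test for a symmetric matrix given as a list of rows."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if acc <= 0.0:        # non-positive pivot: not positive definite
                    return False
                low[i][j] = acc ** 0.5
            else:
                low[i][j] = acc / low[j][j]
    return True

def rosen_condition(jacobian):
    """True if F + F^T is positive definite for a constant Jacobian F."""
    n = len(jacobian)
    sym = [[jacobian[i][j] + jacobian[j][i] for j in range(n)] for i in range(n)]
    return is_positive_definite(sym)

# Hypothetical game 1: g_1 = a1^2 + a1*a2, g_2 = a2^2 - a1*a2,
# so f(a) = (2*a1 + a2, 2*a2 - a1) and F = [[2, 1], [-1, 2]].
F_ok = [[2.0, 1.0], [-1.0, 2.0]]

# Hypothetical game 2: g_1 = a1^2/2 + 3*a1*a2, g_2 = a2^2/2 + 3*a1*a2,
# so f(a) = (a1 + 3*a2, 3*a1 + a2) and F = [[1, 3], [3, 1]].
F_bad = [[1.0, 3.0], [3.0, 1.0]]

print(rosen_condition(F_ok), rosen_condition(F_bad))
```

In both games each player's cost is convex in his own action, yet only the first passes the test. Because the condition is only sufficient, a failure does not by itself rule out a unique equilibrium point.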


Theorem 21. Assume $A_s$ is continuous on $S$ with $A_s^i$ convex for each
$s \in S$ and $i = 1,\ldots,N$. Also assume $d(s,a)$ is constant in $a$, $b(s,a)$
is affine in $a$, and $c(s,a)$ is diagonally strictly convex in $a$, all
for each $s \in S$. Then a solution exists for this diffusion process
game.

Proof. The function $g(s,t,u,a)$ is diagonally strictly convex, so by
Rosen [17] there exists a unique Nash equilibrium point for each $(s,t,u)$,
that is, by the above remarks $\Gamma(s,t,u)$ is a continuous function on
$S \times E^{2N}$. By differential equation theory and the arguments of
Section 3, there exists a solution $v(s)$ to (1), (2) with
$\Gamma(s,v(s),v'(s))$ substituted for $a(s)$. Hence a solution of the game is
$a(s) = \Gamma(s,v(s),v'(s)) \in M$ and the corresponding value of the game is
$v(s) = v(s,a)$.

Example. This example is a two-person game. Let the state space and
sets of admissible control values equal the unit interval. Let
$d(s,a_1,a_2) = 1$, $b(s,a_1,a_2) = a_1 + a_2$, and $c_i(s,a_1,a_2) = C + a_i$,
$i = 1,2$, where $C$ is a constant. Suppose the boundary condition at
$r_0$ is reflection and that absorption occurs at $r_1$ with cost $A_1$.
Following the above procedure, we have for $i = 1,2$ that $a_i(s,t,u) = 1$
for $u_i < -1$ and $a_i(s,t,u) = 0$ otherwise. By symmetry we have that
$v_1(s) = v_2(s)$ is the solution to

(32)  $v_1''(s) = -2a_1(s,v(s),v'(s))\,v_1'(s) + \lambda v_1(s) - C - a_1(s,v(s),v'(s))$

satisfying $v_1'(r_0) = 0$ and $v_1(r_1) = A_1$. In some neighborhood of $r_0$
we have $v_1'(s) > -1$, so in this neighborhood $a(s,v(s),v'(s)) = (0,0)$
and, for some constant $q$, $v_1(s) = qe^{\sqrt{\lambda}s} + qe^{-\sqrt{\lambda}s} + C/\lambda$. If

$A_1 \ge -\dfrac{e^{\sqrt{\lambda}} + e^{-\sqrt{\lambda}}}{\sqrt{\lambda}\,(e^{\sqrt{\lambda}} - e^{-\sqrt{\lambda}})} + \dfrac{C}{\lambda}$,

then $q$ can be chosen so that $v_1(1) = A_1$ and $v_1'(s) \ge -1$ for all
$s \in S$, in which case $a(s) = (0,0)$ is optimal for all $s \in S$. If

$A_1 < -\dfrac{e^{\sqrt{\lambda}} + e^{-\sqrt{\lambda}}}{\sqrt{\lambda}\,(e^{\sqrt{\lambda}} - e^{-\sqrt{\lambda}})} + \dfrac{C}{\lambda}$,

then for some $s_0 \in (0,1)$ we have $v_1'(s_0) = -1$ and $a(s) = (0,0)$ optimal
for all $s \in [0,s_0)$. For $v_1(s)$ to exist, we must have $v_1''(s_0) < 0$
when $a_1(s_0,v(s_0),v'(s_0)) = 1$ in (32). But this is easily verified, so
for all $s \ge s_0$ in some neighborhood of $s_0$ we have $a(s) = (1,1)$
optimal and

$v_1(s) = \tau_1 e^{\mu_1 s} + \tau_2 e^{\mu_2 s} + (C + 1)/\lambda$,

where $\mu_1 = -1 + \sqrt{1 + \lambda}$, $\mu_2 = -1 - \sqrt{1 + \lambda}$, and
$\tau_1$ and $\tau_2$ are constants. It remains to show $a(s) = (1,1)$ is
optimal for all $s \ge s_0$. Suppose not, but that $s_1 < 1$, say, is the
smallest $s > s_0$ such that $v_1'(s_1) = -1$. Then
$v_1''(s_1) < v_1''(s_0) < 0$, a contradiction. The unknown constants $q$,
$\tau_1$, $\tau_2$, and $s_0$ can be solved from the boundary conditions and the
fact that $v'(s)$ is continuous with $v'(s_0) = -1$.
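The four unknown constants of this example can be found numerically by a one-dimensional search on the switching point. The Python sketch below assumes $r_0 = 0$, $r_1 = 1$, the two closed-form branches $v_1(s) = q(e^{\sqrt{\lambda}s} + e^{-\sqrt{\lambda}s}) + C/\lambda$ and $v_1(s) = \tau_1 e^{\mu_1 s} + \tau_2 e^{\mu_2 s} + (C+1)/\lambda$, and hypothetical parameter values. The function names are illustrative only. For each trial $s_0$ it matches value and slope $-1$ at $s_0$, then bisects on the absorbing boundary condition $v_1(1) = A_1$.

```python
import math

def switching_solution(lam, C, A1, tol=1e-12):
    """Return (s0, q, tau1, tau2) with v1'(s0) = -1 and v1(1) = A1."""
    root = math.sqrt(lam)
    mu1 = -1.0 + math.sqrt(1.0 + lam)
    mu2 = -1.0 - math.sqrt(1.0 + lam)

    def constants(s0):
        q = -1.0 / (2.0 * root * math.sinh(root * s0))   # left slope equals -1 at s0
        v0 = 2.0 * q * math.cosh(root * s0) + C / lam      # left value at s0
        rhs = v0 - (C + 1.0) / lam
        x = (-1.0 - mu2 * rhs) / (mu1 - mu2)               # tau1 * exp(mu1 * s0)
        y = (mu1 * rhs + 1.0) / (mu1 - mu2)                # tau2 * exp(mu2 * s0)
        return q, x * math.exp(-mu1 * s0), y * math.exp(-mu2 * s0)

    def terminal_gap(s0):
        _, tau1, tau2 = constants(s0)
        return tau1 * math.exp(mu1) + tau2 * math.exp(mu2) + (C + 1.0) / lam - A1

    lo, hi = 1e-6, 1.0
    if terminal_gap(lo) * terminal_gap(hi) > 0:
        raise ValueError("no switching point found in (0, 1)")
    while hi - lo > tol:                                   # plain bisection
        mid = 0.5 * (lo + hi)
        if terminal_gap(lo) * terminal_gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    s0 = 0.5 * (lo + hi)
    return (s0,) + constants(s0)

def v1(s, lam, C, sol):
    """Piecewise value function built from the constants in sol."""
    s0, q, tau1, tau2 = sol
    if s <= s0:
        r = math.sqrt(lam)
        return q * (math.exp(r * s) + math.exp(-r * s)) + C / lam
    mu1 = -1.0 + math.sqrt(1.0 + lam)
    mu2 = -1.0 - math.sqrt(1.0 + lam)
    return tau1 * math.exp(mu1 * s) + tau2 * math.exp(mu2 * s) + (C + 1.0) / lam

lam, C, A1 = 1.0, 1.0, -1.0        # hypothetical values with A1 below the threshold
sol = switching_solution(lam, C, A1)
print(sol[0], v1(1.0, lam, C, sol))
```

Bisection is used only because it needs no derivative of the boundary mismatch; any scalar root finder would serve equally well.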


8. The Non-zero Sum Problem with Undiscounted Costs

The non-zero sum, N-person diffusion process game problem with
undiscounted costs will be one of two types, depending on whether the
boundary conditions are conservative or non-conservative. The results
in this section parallel those of Sections 4 and 7, and, consequently,
they will be brief. The conservative case will be treated in the second
half of this section.

For  the  purposes  of  this  section,  the  boundary  conditions  are 
said  to  be  non-conservative  if  at  least  one  boundary  is  absorbing  and 
neither  boundary  is  purely  adhesive,  that  is, 


•t-  < . 


k  +  1 1  +  t- 

J  J  J 


’  0, 


J  °  0,1 


Let $v(s,a) = v(s,a_1,\ldots,a_N) = v(s)$ denote the expected undiscounted
cost of such a process corresponding to the admissible control $a \in M$.
Then $v(s,a)$ will be the unique solution of (1), (2) with $\lambda = 0$.
The control $a \in M$ is said to be optimal if it defines a Nash
equilibrium point with respect to the expected cost functions, that is,

$v_i(s,a) \le v_i(s,a_1,\ldots,a_{i-1},\alpha_i,a_{i+1},\ldots,a_N)$

for all $s \in S$, all $\alpha_i \in M_i$, and each $i = 1,\ldots,N$. In this case
$v(s,a)$ is said to be a value of the game.

Theorem 22. With non-conservative boundary conditions, a control $a \in M$
is optimal if and only if for each $s \in S$ which is a continuity point
of $a(s)$

$d(s,a(s))^{-1}[\,b(s,a(s))v'(s) + c(s,a(s))\,] \in \operatorname{val}_{a \in A_s} \{ d(s,a)^{-1}[\,b(s,a)v'(s) + c(s,a)\,] \}$,

where $v(s) = v(s,a)$, and

$\sigma_j\,(c(r_j,a(r_j)) - \gamma_j) = 0, \quad j = 0,1,$

where $\gamma_j \in \operatorname{val}_{a \in A_{r_j}} c(r_j,a)$, $j = 0,1$.

The proof is essentially the same as that for Theorem 18, so it
will be omitted. Moreover, the remarks and computational procedure that
follow Theorem 18 apply to this case as well.

Example. This example is identical to that of the preceding section
except that the costs are undiscounted. Proceeding in a similar manner,
we have that $v_1(s) = v_2(s)$ is the solution to

$v_1''(s) = -2a_1(s,v(s),v'(s))\,v_1'(s) - C - a_1(s,v(s),v'(s))$

satisfying $v_1'(r_0) = 0$ and $v_1(r_1) = A_1$, and that
$a_1(s,v(s),v'(s)) = a_2(s,v(s),v'(s)) = 1$ $(= 0)$ if $v_1'(s) < -1$ $(\ge -1)$.
If $C \le 1$, then $a(s) = (0,0)$ is optimal for all $s \in S$ and
$v_1(s) = A_1 + C(1 - s^2)/2$. If $C > 1$, then $a(s) = (0,0)$ is optimal and

$v_1(s) = A_1 + C/2 + (C - 1)(\exp(-2 + 2/C) - 1)/4 - Cs^2/2$

on $[0,1/C)$, and $a(s) = (1,1)$ is optimal and

$v_1(s) = A_1 + (C + 1)(1 - s)/2 + \exp(-2 + 2/C)(1 - \exp(2 - 2s))(C - 1)/4$

on $(1/C,1]$.
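The closed-form solution of this example can be checked numerically. The Python sketch below assumes $r_0 = 0$, $r_1 = 1$, and the piecewise expressions $v_1(s) = A_1 + C/2 + (C-1)(\exp(-2+2/C) - 1)/4 - Cs^2/2$ on $[0,1/C)$ and $v_1(s) = A_1 + (C+1)(1-s)/2 + \exp(-2+2/C)(1 - \exp(2-2s))(C-1)/4$ on $(1/C,1]$. It uses central differences to confirm that the slope crosses $-1$ at $s = 1/C$. The function names and the parameter values are hypothetical.

```python
import math

def v1_undiscounted(s, C, A1):
    """Piecewise closed form of v_1(s) on S = [0, 1]."""
    if C <= 1.0:
        return A1 + C * (1.0 - s * s) / 2.0
    k = math.exp(-2.0 + 2.0 / C)
    if s <= 1.0 / C:                       # regime a(s) = (0, 0)
        return A1 + C / 2.0 + (C - 1.0) * (k - 1.0) / 4.0 - C * s * s / 2.0
    # regime a(s) = (1, 1)
    return (A1 + (C + 1.0) * (1.0 - s) / 2.0
            + k * (1.0 - math.exp(2.0 - 2.0 * s)) * (C - 1.0) / 4.0)

def slope(f, s, h=1e-6):
    """Central-difference approximation of f'(s)."""
    return (f(s + h) - f(s - h)) / (2.0 * h)

def second_derivative(f, s, h=1e-4):
    """Central-difference approximation of f''(s)."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / (h * h)

C, A1 = 2.0, 0.0                           # hypothetical values with C > 1

def f(s):
    return v1_undiscounted(s, C, A1)

# Residual of v'' = -2 a v' - C - a inside each regime.
res_left = second_derivative(f, 0.25) + C                            # a = 0
res_right = second_derivative(f, 0.75) + 2.0 * slope(f, 0.75) + C + 1.0  # a = 1
print(slope(f, 0.25), slope(f, 0.5), slope(f, 0.75), res_left, res_right)
```

With these values the switch occurs at $s = 1/C = 0.5$, where the slope equals $-1$ from both sides.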


We now discuss the other type of undiscounted cost problem, the
conservative case. For purposes of this section, the boundary conditions
are said to be conservative if neither boundary is absorbing and at
least one boundary is not purely adhesive, that is,


+  <  . 


vo 


+  ri  + 


Let $\Theta(a) = \Theta(a_1,\ldots,a_N) = \Theta$ denote the vector of mean costs
per unit time of such a process corresponding to the control $a \in M$.
Then $\Theta(a)$ is the unique vector to which there exists a solution
$w(s,a)$ to (3) and (4). The control $a \in M$ is said to be optimal if it
defines a Nash equilibrium point with respect to the mean costs, that is,

$\Theta_i(a) \le \Theta_i(a_1,\ldots,a_{i-1},\alpha_i,a_{i+1},\ldots,a_N)$

for all $\alpha_i \in M_i$, $i = 1,\ldots,N$. In this case $\Theta(a)$ is said to be
a value of the game.
Theorem 23. With conservative boundary conditions a control $a \in M$ is
optimal if and only if for each $s \in S$ which is a continuity point of
$a(s)$

$d(s,a(s))^{-1}[\,b(s,a(s))w(s) - \Theta + c(s,a(s))\,] \in \operatorname{val}_{a \in A_s} \{ d(s,a)^{-1}[\,b(s,a)w(s) - \Theta + c(s,a)\,] \}$,

where $w(s) = w(s,a)$ and $\Theta = \Theta(a)$, and

$\sigma_j\,(c(r_j,a(r_j)) - \gamma_j) = 0, \quad j = 0,1,$

where $\gamma_j \in \operatorname{val}_{a \in A_{r_j}} c(r_j,a)$, $j = 0,1$.

The proof is essentially the same as that for Theorem 18, so it
will be omitted. Moreover, the remarks and computational procedure that
follow Theorem 18 apply to this case as well.

Example. This is an example of an N-person game. Let $S = A_s^i = [-1,1]$
for $i = 1,\ldots,N$ and all $s \in S$, $d(s,a) = 1$, $b(s,a) = a_1 + \cdots + a_N$,
and $c_i(s,a) = |s|$ for $i = 1,\ldots,N$. Suppose reflection occurs at
each boundary. The control $a_i(s) = -1$ if $w_i(s) \ge 0$ and
$a_i(s) = 1$ otherwise. By symmetry, $w_1(s) = \cdots = w_N(s)$, so $w_1(s)$ is
the solution to

$-w_1'(s) = \begin{cases} -N w_1(s) - \Theta_1 + |s|, & w_1(s) \ge 0 \\ \phantom{-}N w_1(s) - \Theta_1 + |s|, & w_1(s) < 0 \end{cases}$

satisfying $w_1(1) = w_1(-1) = 0$. By symmetry we must have $w_1(0) = 0$,
so we verify that

$\Theta_i = \dfrac{1}{N} - \dfrac{1}{e^N - 1}$

and that for $i = 1,\ldots,N$

$a_i(s) = \begin{cases} \phantom{-}1, & s \in [-1,0) \\ -1, & s \in (0,1] \end{cases}$


9. Application: Control of Pollution

Suppose that the index of pollution is constrained to fall between
zero and some positive number. This would be the case, for example, when
dealing with an air basin or a body of water. Assume that a collection
of N factories, automobiles, or similar polluting mechanisms contributes
to this pollution and that each such mechanism can control this index of
pollution by choosing the amount of its waste products that is emitted as
a pollutant as opposed to being processed in a pollution-free manner.
Finally, assume that there exists a cost to each controller for each
level of control as well as to each value of the pollution index. Then a
non-zero sum, N-person diffusion process game may perhaps be used as a
model of this pollution system.

This pollution model is a generalization of one in Chapter II.
The state of the process will correspond to the index of pollution, and
the boundary behavior, at zero and the maximum index value, will be
reflection or possibly reflection combined with adhesion. An admissible
control component $a_i(s)$ will be a piecewise continuous function on
the state space that will represent the fraction of the $i$th
controller's waste products that is being emitted as a pollutant.
Presumably, the bigger the fraction of wastes being emitted, the smaller
the control cost rate but the bigger the drift coefficient. Similarly,
the bigger the pollution index, the greater the pollutant cost rate. An
optimal control will be an admissible control which yields a Nash
equilibrium point with respect to the expected costs.

Example. This example involves undiscounted costs and N polluting
mechanisms. Let $S = A_s^i = [0,1]$ for $i = 1,\ldots,N$ so that the $i$th
control component equals the fraction of the $i$th polluting mechanism's
wastes that is being processed in a pollution-free manner. Let
$d(s,a) = 1$, $b(s,a) = -a_1 - \cdots - a_N$, and $c_i(s,a) = Cs + a_i$ for
$i = 1,\ldots,N$, and suppose pure reflection occurs at each boundary. By
Theorem 23 we have

$a_i(s) = \begin{cases} 0, & w_i(s) \le 1 \\ 1, & w_i(s) > 1 \end{cases}$

and by symmetry $w_1(s) = \cdots = w_N(s)$. If $a_i(s) = 0$ for all $s \in S$,
then $\Theta_i = C/2$ and $w_i(s) = Cs/2 - Cs^2/2$, so this control is
optimal if $C \le 8$.

On the other hand, suppose $C > 8$ so that there exist $s_0$ and $s_1$
such that $0 < s_0 < s_1 < 1$, $w_i(s_0) = w_i(s_1) = 1$, and $a_i(s) = 1$ for
all $s \in (s_0,s_1)$. On $[s_0,s_1]$ we must have

$w_i(s) = \tau e^{Ns} + \dfrac{Cs + 1 - \Theta_i}{N} + \dfrac{C}{N^2}$,

where $\tau$ is determined from $w_i(s_0) = w_i(s_1)$ to be

$\tau = \dfrac{C(s_0 - s_1)}{N\,[\,e^{Ns_1} - e^{Ns_0}\,]} < 0.$


Since  c  we  have 


Ns 


C (sn  -  s,  )e 

V  x. 


0 


I  Ns.  Ns  ; 
N I  e  1  -  e  I 


■1  , 


but for large enough N this equation will not be satisfied by any
$C \ge 0$. Hence if $C > 8$, an optimal control may not exist.
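The threshold on $C$ in the no-treatment case of this example can be illustrated with a few lines of Python. The sketch assumes that, with every $a_i(s) = 0$, the function $w_i$ reduces to $w_i(s) = Cs/2 - Cs^2/2$ with $\Theta_i = C/2$, and that $a_i = 0$ remains a best response exactly where $w_i(s) \le 1$. The function names and the grid size are hypothetical.

```python
def w_no_treatment(s, C):
    """w_i(s) = C*s/2 - C*s^2/2 when every a_i(s) = 0 (Theta_i = C/2)."""
    return C * s / 2.0 - C * s * s / 2.0

def no_treatment_is_equilibrium(C, n_grid=1001):
    """Check w_i(s) <= 1 on a uniform grid of S = [0, 1]."""
    return all(w_no_treatment(k / (n_grid - 1), C) <= 1.0
               for k in range(n_grid))

for C in (4.0, 8.0, 8.1):
    print(C, no_treatment_is_equilibrium(C))
```

The maximum of $w_i$ is $C/8$, attained at $s = 1/2$, which is why the check changes from true to false as $C$ passes 8. The threshold does not depend on N in this regime.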


10. Application: Optimal Warfare Strategies

Suppose a war between two antagonists is characterized by an index
that varies continuously between two real numbers $r_0 < r_1$, and this war
is terminated in favor of the first (second) antagonist when this index
first attains $r_1$ ($r_0$). For example, this index could represent the
portion of some land mass under the control of the first antagonist as
opposed to being under the control of the second. Or this index could
represent the portion of some population that is allegiant to one
government as opposed to being allegiant to a second. Suppose each
antagonist can control this index by choosing alternative levels of
fighting effort. Finally, suppose costs to each antagonist are
associated with each of the two possible outcomes as well as with
alternative levels of fighting effort and the index value. Then the
problem of determining the optimal level of fighting for these two
antagonists can perhaps be resolved by consideration of a non-zero sum,
two-person diffusion process game.

The state of the diffusion process will correspond to the warfare
index, and the boundary behavior, at $r_0$ and $r_1$, will be absorption
or possibly absorption combined with another type of boundary phenomenon.
Let the termination costs at each boundary be positive for the loser and
negative for the winner. Let the control represent levels of fighting
effort for the two players so that each player's continuous movement cost
represents the cost to him of the fighting levels and warfare index
being at particular values for one unit of time. If a termination at
boundary $r_1$ represents victory for player one, then presumably the
bigger the first (second) player's control the greater the tendency for
the warfare index to increase (decrease). Similarly, the greater the
level of fighting the greater the continuous movement cost.


Example. Let $S = A_s^i = [0,1]$ and suppose $d(s,a) = 1$,
$b(s,a) = a_1 - a_2$, and $c_i(s,a) = a_i$ for $i = 1,2$. Assume the boundary
behavior is absorption and that termination at $r_j$ represents defeat
for player $j + 1$, where $j = 0,1$, with termination cost $C > 0$ for the
loser and $-C$ for the winner. Finally, suppose the costs are
undiscounted, so we consider Theorem 22.

By symmetry we must have $a_1(s) = a_2(1 - s)$ and $v_1(s) = v_2(1 - s)$
for all $s \in S$. Moreover, $a_1(s) = 0$ if $v_1'(s) \ge -1$ and $a_1(s) = 1$
otherwise. If $C \le 1/2$ and $a(s) = (0,0)$, then $v_1(s) = -2Cs + C$ so
$v_1'(s) \ge -1$ and $a(s) = (0,0)$ is optimal. If $C \ge 3/4$ and
$a(s) = (1,1)$, then $v_1(s) = -s^2/2 + (-2C + 1/2)s + C$ so $v_1'(s) \le -1$
and $a(s) = (1,1)$ is optimal. Finally, if $1/2 < C < 3/4$, then the
following argument will show that for some $s_0 \in (0,1/2)$ where
$v_1'(s_0) = -1$ we have $a_1(s) = 0$ optimal on $[0,s_0)$ and $a_1(s) = 1$
optimal on $(s_0,1]$. On $(0,s_0)$ we have $a(s) = (0,1)$, so

$v_1(s) = C + e^{-s_0} - e^{s - s_0}$.

On $(s_0,1 - s_0)$ we have $a(s) = (1,1)$, so

$v_1(s) = -\tfrac{1}{2}(s - s_0)^2 + C + e^{-s_0} + s_0 - 1 - s$.

On $(1 - s_0,1)$ we have $a(s) = (1,0)$, so

$v_1(s) = 5s_0 - \tfrac{5}{2} - 2s_0^2 + C + e^{-s_0} + (1 - 2s_0)e^{1 - s_0 - s} - s$.

Solving the equation $v_1(1) = -C$ yields a unique solution for
$s_0 \in (0,1/2)$ if $1/2 < C < 3/4$, so we are done provided $a(s)$ is
optimal. The function $v_1(s)$ is concave on $(0,1 - s_0)$ and convex on
$(1 - s_0,1]$, so $v_1'(s) \ge -1$ on $(0,s_0)$ and $v_1'(s) \le -1$ on
$(s_0,1]$ provided $v_1'(1) \le -1$. This last inequality is true, so $a(s)$
is optimal.
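The switching point of this warfare example has no closed form, but it is easy to compute. The Python sketch below assumes the three-branch expression for $v_1(s)$ with branches on $(0,s_0)$, $(s_0,1-s_0)$, and $(1-s_0,1)$, and finds $s_0$ by bisection on the terminal condition $v_1(1) = -C$. The function names and the value of $C$ are hypothetical.

```python
import math

def v1(s, s0, C):
    """Piecewise v_1(s) for controls (0,1), (1,1), (1,0) on the three intervals."""
    if s <= s0:
        return C + math.exp(-s0) - math.exp(s - s0)
    if s <= 1.0 - s0:
        return -0.5 * (s - s0) ** 2 + C + math.exp(-s0) + s0 - 1.0 - s
    return (5.0 * s0 - 2.5 - 2.0 * s0 ** 2 + C + math.exp(-s0)
            + (1.0 - 2.0 * s0) * math.exp(1.0 - s0 - s) - s)

def switch_point(C, tol=1e-12):
    """Bisection for s0 in (0, 1/2) with v_1(1) = -C; assumes 1/2 < C < 3/4."""
    def gap(s0):
        return v1(1.0, s0, C) + C
    lo, hi = 0.0, 0.5
    if gap(lo) * gap(hi) > 0:
        raise ValueError("no sign change on (0, 1/2) for this C")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(lo) * gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

C = 0.6                                   # hypothetical value in (1/2, 3/4)
s0 = switch_point(C)
print(s0, v1(0.0, s0, C), v1(1.0, s0, C))
```

The computed $s_0$ can then be substituted back into the three branches to confirm that $v_1(0) = C$, $v_1(1) = -C$, and that the branches join continuously at $s_0$ and $1 - s_0$.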